CHLAMYDIA PNEUMONIAE GENOME SEQUENCE

CROSS-REFERENCES TO RELATED APPLICATIONS 
The present application is related to 60/128,606, filed April 8, 1999 and 
60/108,279, filed November 12, 1998, which are incorporated herein by reference. 

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 

FIELD OF THE INVENTION 
This invention relates to nucleic acids and polypeptides from Chlamydia
pneumoniae and to their use in the diagnosis, prevention and treatment of diseases
associated with C. pneumoniae.

BACKGROUND OF THE INVENTION 

Chlamydiaceae is a family of obligate intracellular parasites with a tropism
for epithelial cells lining the mucous membranes. The bacteria have two morphologically
distinct forms, "elementary body" and "reticulate body". The elementary body is the 
infectious form, and has a rigid cell wall, primarily of cross-linked outer membrane 
proteins. The reticulate body is the intracellular, metabolically active form. A unique 
developmental cycle between these two forms characterizes Chlamydia growth. 

C. pneumoniae is a human respiratory pathogen that causes acute
respiratory disease and approximately 10% of community-acquired pneumonia.
Antibody prevalence studies have shown that virtually everyone is infected with C.
pneumoniae at some time, and that reinfection is common. In addition to respiratory
disease, studies have shown an association of this organism with coronary artery disease.
The organism has been demonstrated in atherosclerotic lesions of the aorta and coronary
arteries by immunocytochemistry and by polymerase chain reaction (Kuo et al. (1993)
J Infect Dis 167(4):841-849).

Recent reports have further demonstrated the presence of C. pneumoniae
in the walls of abdominal aortic aneurysms (Juvonen et al. (1997) J Vasc Surg
25(3):499-505). Abdominal aortic aneurysms are frequently associated with
atherosclerosis, and inflammation may be an important factor in aneurysmal dilatation.
C. pneumoniae may play a role in maintaining inflammation and triggering the
development of aortic aneurysms.

Muhlestein et al. (1996) JACC 27:1555-61, reported a differential
incidence of Chlamydia species within the coronary artery wall of patients with
atherosclerosis versus those with other forms of cardiovascular disease. The extremely
high rate of possible infection in patients with symptomatic atherosclerotic disease
compared to the very low rate in patients with normal coronary arteries or coronary artery
disease from chronic transplant rejection provides evidence for a direct link between the
atherosclerotic process and Chlamydia infection. Because a history of chlamydial
infection is so prevalent in the population, the issue of causality remains unresolved. On a
physiologic and pathologic level, abnormal interactions among endothelial cells, platelets,
macrophages and lymphocytes may lead to a cascade of events resulting in acute
endothelial damage, thrombosis and repair, chronically leading to the development of
atheroma in blood vessels.

C. pneumoniae is related to other Chlamydia species, but the level of
sequence similarity is relatively low. Very little is known about the biology of this
organism, although it appears to be an important human pathogen. Allelic diversity and
structural relationships between specific genes of Chlamydial species are described in
Kaltenboeck et al. (1993) J Bacteriol 175(2):487-502; Gaydos et al. (1992) Infect Immun
60(12):5319-5323; Everett et al. (1997) Int J Syst Bacteriol 47(2):461-473; and
Pudjiatmoko et al. (1997) Int J Syst Bacteriol 47(2):425-431.

A number of studies have been published describing methods for detection
of C. pneumoniae, and for distinguishing between Chlamydial species. Such methods
include PCR detection (Rasmussen et al. (1992) Mol Cell Probes 6(5):389-394; Holland
et al. (1990) J Infect Dis 162(4):984-987); a simplified polymerase chain reaction-enzyme
immunoassay (Wilson et al. (1996) J Appl Bacteriol 80(4):431-438); and sequence
determination and restriction endonuclease cleavage (Herrmann et al. (1996) J Clin
Microbiol 34(8):1897-1902).

Antigenic and molecular analyses of different C. pneumoniae strains are
described in Jantos et al. (1997) J Clin Microbiol 35(3):620-623. Some genes of C.
pneumoniae have been isolated and sequenced. These include the GroE operon (Kikuta
et al. (1991) Infect Immun 59(12):4665-4669); the major outer membrane protein (Perez et
al. (1991) Infect Immun 59(6):2195-2199); the DnaK protein homolog (Kornak et al.
(1991) Infect Immun 59(2):721-725); as well as a number of ribosomal and other genes.

SUMMARY OF THE INVENTION 
This invention provides the genomic sequence of Chlamydia pneumoniae. 
The sequence information is useful for a variety of diagnostic and analytical methods. 
The genomic sequence may be embodied in a variety of media, including computer
readable forms, or as a nucleic acid comprising a selected fragment of the sequence.
Such fragments generally consist of an open reading frame, transcriptional or translational
control elements, or fragments derived therefrom. Proteins encoded by the open reading 
frames are useful for diagnostic purposes, as well as for their enzymatic or structural 
activity. 


DEFINITIONS 

The term "amino acid" refers to naturally occurring and synthetic amino
acids, as well as amino acid analogs and amino acid mimetics that function in a manner
similar to the naturally occurring amino acids. Naturally occurring amino acids are those
encoded by the genetic code, as well as those amino acids that are later modified, e.g.,
hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to
compounds that have the same basic chemical structure as a naturally occurring amino
acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and
an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl
sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide
backbones, but retain the same basic chemical structure as a naturally occurring amino
acid. Amino acid mimetics refers to chemical compounds that have a structure that is
different from the general chemical structure of an amino acid, but that function in a
manner similar to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known
three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB
Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by
their commonly accepted single-letter codes.

"Amplification" primers are oligonucleotides comprising either natural or
analogue nucleotides that can serve as the basis for the amplification of a select nucleic
acid sequence. They include, e.g., polymerase chain reaction primers and ligase chain
reaction oligonucleotides.

"Antibody" refers to an immunoglobulin molecule able to bind to a
specific epitope on an antigen. Antibodies can be a polyclonal mixture or monoclonal.
Antibodies can be intact immunoglobulins derived from natural sources or from
recombinant sources and can be immunoreactive portions of intact immunoglobulins.
Antibodies may exist in a variety of forms including, for example, Fv, Fab, and F(ab)2, as
well as in single chains. Single-chain antibodies, in which genes for a heavy chain and a
light chain are combined into a single coding sequence, may also be used.

An "antigen" is a molecule that is recognized and bound by an antibody,
e.g., peptides, carbohydrates, organic molecules, or more complex molecules such as
glycolipids and glycoproteins. The part of the antigen that is the target of antibody
binding is an antigenic determinant, and a small functional group that corresponds to a
single antigenic determinant is called a hapten.

"Biological sample" refers to any sample obtained from a living or dead
organism. Examples of biological samples include biological fluids and tissue specimens.
Such biological samples can be prepared for analysis of the presence of C. pneumoniae
nucleic acids, proteins, or antibodies specifically reactive with the proteins.

The term "C. pneumoniae gene" shall be intended to mean the open
reading frame encoding specific C. pneumoniae polypeptides, as well as adjacent 5' and
3' non-coding nucleotide sequences involved in the regulation of expression, up to about
2 kb beyond the coding region, but possibly further in either direction. The gene may be
introduced into an appropriate vector for extrachromosomal maintenance or for
integration into a host genome.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively
modified variants refers to those nucleic acids which encode identical or essentially
identical amino acid sequences, or where the nucleic acid does not encode an amino acid
sequence, to essentially identical sequences. Specifically, degenerate codon substitutions
may be achieved by generating sequences in which the third position of one or more
selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues
(Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem.
260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids
encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all
encode the amino acid alanine. Thus, at every position where an alanine is specified by a
codon, the codon can be altered to any of the corresponding codons described without
altering the encoded polypeptide. Such nucleic acid variations are "silent variations,"
which are one species of conservatively modified variations. Every nucleic acid sequence
herein which encodes a polypeptide also describes every possible silent variation of the
nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG,
which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only
codon for tryptophan) can be modified to yield a functionally identical molecule.
Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is
implicit in each described sequence.
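
By way of illustration only, the enumeration of silent variations described above can be
sketched in Python. The sketch assumes the standard genetic code written in the DNA
alphabet (T in place of U); the names CODON_TABLE, translate, synonymous_codons and
silent_variants are hypothetical and form no part of the disclosure.

```python
from itertools import product

# Standard genetic code in the DNA alphabet (an assumption of this sketch).
# Codons are ordered with T, C, A, G at each position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(codon):
    """Return every codon that encodes the same residue as `codon`."""
    residue = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def silent_variants(dna):
    """Enumerate every nucleic acid that encodes the same polypeptide as `dna`."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    choices = [synonymous_codons(c) for c in codons]
    return ["".join(combo) for combo in product(*choices)]
```

For the short sequence ATGGCATGG (Met-Ala-Trp), only the alanine codon can vary, so
silent_variants returns four sequences. The number of variants grows multiplicatively
with sequence length, so full enumeration is practical only for short fragments.
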

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alters, adds or deletes a single amino acid or a small percentage of amino
acids in the encoded sequence is a "conservatively modified variant" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well
known in the art. Such conservatively modified variants are in addition to and do not
exclude polymorphic variants, interspecies homologs, and alleles of the invention.

The following groups each contain amino acids that are conservative
substitutions for one another:

1) Alanine (A), Glycine (G);
2) Serine (S), Threonine (T);
3) Aspartic acid (D), Glutamic acid (E);
4) Asparagine (N), Glutamine (Q);
5) Cysteine (C), Methionine (M);
6) Arginine (R), Lysine (K), Histidine (H);
7) Isoleucine (I), Leucine (L), Valine (V); and
8) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).

(see, e.g., Creighton, Proteins (1984)).
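
As a non-limiting sketch, the eight groups listed above can be encoded as a lookup so
that a substitution can be classified programmatically. The names CONSERVATIVE_GROUPS,
GROUP_OF and is_conservative are hypothetical. Residues that appear in no group (for
example proline) are treated here as having no conservative partner, which is an
assumption of the sketch.

```python
# The eight conservative substitution groups, in one-letter code.
CONSERVATIVE_GROUPS = ("AG", "ST", "DE", "NQ", "CM", "RKH", "ILV", "FYW")

# Map each residue to the index of the group that contains it.
GROUP_OF = {
    residue: index
    for index, group in enumerate(CONSERVATIVE_GROUPS)
    for residue in group
}


def is_conservative(original, replacement):
    """Return True if replacing `original` with `replacement` stays within one group.

    Identical residues are not substitutions and return False, as do residues
    that belong to no listed group.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement or original not in GROUP_OF:
        return False
    return GROUP_OF[original] == GROUP_OF.get(replacement)
```

For example, is_conservative("D", "E") is True, while is_conservative("A", "S") is
False because alanine and serine fall in different groups.
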
The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences
that are the same or have a specified percentage of amino acid residues or nucleotides that
are the same, when compared and aligned for maximum correspondence over a
comparison window, as measured using one of the following sequence comparison
algorithms or by manual alignment and visual inspection. This definition also refers to
the complement of a test sequence, which has a designated percent sequence or
subsequence complementarity when the test sequence has a designated or substantial
identity to a reference sequence. For example, a designated amino acid percent identity
of 95% refers to sequences or subsequences that have at least about 95% amino acid
identity when aligned for maximum correspondence over a comparison window as
measured using one of the following sequence comparison algorithms or by manual
alignment and visual inspection. Such sequences would then be said to have substantial
identity, or to be substantially identical to each other. Preferably, sequences have at least
about 70% identity, more preferably 80% identity, more preferably 90-95% identity and
above. Preferably, the percent identity exists over a region of the sequence that is at least
about 25 amino acids in length, more preferably over a region that is 50-100 amino acids
in length.

When percentage of sequence identity is used in reference to proteins or
peptides, it is recognized that residue positions that are not identical often differ by
conservative amino acid substitutions, where amino acid residues are substituted for
other amino acid residues with similar chemical properties (e.g., charge or
hydrophobicity) and therefore do not change the functional properties of the molecule.
Where sequences differ in conservative substitutions, the percent sequence identity may
be adjusted upwards to correct for the conservative nature of the substitution. Means for
making this adjustment are well known to those of skill in the art. Typically this involves
scoring a conservative substitution as a partial rather than a full mismatch, thereby
increasing the percentage sequence identity. Thus, for example, where an identical amino
acid is given a score of 1 and a non-conservative substitution is given a score of zero, a
conservative substitution is given a score between zero and 1. The scoring of
conservative substitutions is calculated according to, e.g., the algorithm of Meyers &
Miller, Computer Applic. Biol. Sci. 4:11-17 (1988), e.g., as implemented in the program
PC/GENE (Intelligenetics, Mountain View, California, USA).
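
The upward adjustment described above can be illustrated with a minimal Python sketch.
It is not the Meyers & Miller algorithm or the PC/GENE implementation. It assumes two
sequences that have already been aligned to equal length with "-" marking gaps, uses
the conservative groups listed earlier, and takes the partial score (0.5 by default)
as a free parameter. All names are hypothetical.

```python
CONSERVATIVE_GROUPS = ("AG", "ST", "DE", "NQ", "CM", "RKH", "ILV", "FYW")
GROUP_OF = {
    residue: index
    for index, group in enumerate(CONSERVATIVE_GROUPS)
    for residue in group
}


def adjusted_percent_identity(aligned_a, aligned_b, conservative_score=0.5):
    """Percent identity in which a conservative substitution earns partial credit.

    Identical residues score 1, conservative substitutions score
    `conservative_score` (between 0 and 1), and everything else, including
    any column containing a gap, scores 0.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned, non-empty and of equal length")
    if not 0.0 <= conservative_score <= 1.0:
        raise ValueError("conservative_score must lie between 0 and 1")
    total = 0.0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column contributes nothing
        if a == b:
            total += 1.0
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            total += conservative_score
    return 100.0 * total / len(aligned_a)
```

With ACDEFGHIKL aligned against ACEEFGHLKV there are seven identities and three
conservative substitutions, so a partial score of zero gives 70% and a partial score of
0.5 gives 85%.
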
For sequence comparison, typically one sequence acts as a reference
sequence, to which test sequences are compared. When using a sequence comparison
algorithm, test and reference sequences are entered into a computer, subsequence
coordinates are designated, if necessary, and sequence algorithm program parameters are
designated. Default program parameters can be used, or alternative parameters can be
designated. The sequence comparison algorithm then calculates the percent sequence
identity for the test sequence(s) relative to the reference sequence, based on the
designated or default program parameters.

A comparison window includes reference to a segment of any one of the
number of contiguous positions selected from the group consisting of from 25 to 600,
usually about 50 to about 200, more usually about 100 to about 150, in which a sequence
may be compared to a reference sequence of the same number of contiguous positions
after the two sequences are optimally aligned. Methods of alignment of sequences for
comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv.
Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman &
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson &
Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations
of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by
manual alignment and visual inspection (see, e.g., Ausubel et al., supra).
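
To illustrate the local homology approach referred to above, the following Python sketch
performs a Smith-Waterman style local alignment with a linear gap penalty. The scoring
values (match 2, mismatch -1, gap -2) and the function name smith_waterman are
assumptions chosen for the example; they are not the parameters of GAP, BESTFIT or any
other program named above.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diagonal, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The percent identity over the resulting window can then be computed by counting
identical columns in the two returned strings. The quadratic table is adequate for
short sequences only; production tools use more elaborate gap models and heuristics.
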

One example of a useful algorithm is PILEUP. PILEUP creates a multiple
sequence alignment from a group of related sequences using progressive, pairwise
alignments to show relationship and percent sequence identity. It also plots a tree or
dendrogram showing the clustering relationships used to create the alignment. PILEUP
uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol.
Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins
& Sharp, 5:151-153 (1989). The program can align up to 300 sequences, each
of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment
procedure begins with the pairwise alignment of the two most similar sequences,
producing a cluster of two aligned sequences. This cluster is then aligned to the next
most related sequence or cluster of aligned sequences. Two clusters of sequences are
aligned by a simple extension of the pairwise alignment of two individual sequences. The
final alignment is achieved by a series of progressive, pairwise alignments. The program
is run by designating specific sequences and their amino acid or nucleotide coordinates
for regions of sequence comparison and by designating the program parameters. Using
PILEUP, a reference sequence is compared to other test sequences to determine the
percent sequence identity relationship using the following parameters: default gap weight
(3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained
from the GCG sequence analysis software package, e.g., version 7.0 (Devereaux et al.,
Nuc. Acids Res. 12:387-395 (1984)).
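
The order in which a progressive method joins sequences can be sketched as follows.
This is not PILEUP and performs no alignment. It only shows the "most similar pair
first, then the next most related sequence" ordering described above, and it
substitutes the ratio reported by Python's difflib.SequenceMatcher for a true pairwise
alignment score, which is an assumption made purely for brevity. The function names are
hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Stand-in similarity in [0, 1]; a real tool would use an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def progressive_order(sequences):
    """Return sequence names in the order a progressive method would join them.

    `sequences` maps a name to its sequence. The two most similar sequences
    seed the cluster; each later step adds the remaining sequence that is
    most similar to any sequence already in the cluster.
    """
    names = list(sequences)
    if len(names) < 2:
        return names
    first_pair = max(
        combinations(names, 2),
        key=lambda pair: similarity(sequences[pair[0]], sequences[pair[1]]),
    )
    cluster = list(first_pair)
    remaining = [name for name in names if name not in cluster]
    while remaining:
        nearest = max(
            remaining,
            key=lambda name: max(
                similarity(sequences[name], sequences[member]) for member in cluster
            ),
        )
        cluster.append(nearest)
        remaining.remove(nearest)
    return cluster
```

The returned order corresponds to reading the dendrogram from its tightest cluster
outward. A fuller implementation would also merge clusters with one another rather
than adding single sequences only.
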

Another example of an algorithm that is suitable for determining percent
sequence identity (i.e., substantial similarity or identity) is the BLAST algorithm, which
is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing
BLAST analyses is publicly available through the National Center for Biotechnology
Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying
high scoring sequence pairs (HSPs) by identifying short words of length W in the query
sequence, which either match or satisfy some positive-valued threshold score T when
aligned with a word of the same length in a database sequence. T is referred to as the
neighborhood word score threshold (Altschul et al., supra). These initial neighborhood
word hits act as seeds for initiating searches to find longer HSPs containing them. The
word hits are then extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by
the quantity X from its maximum achieved value; the cumulative score goes to zero or
below, due to the accumulation of one or more negative-scoring residue alignments; or
the end of either sequence is reached. The BLAST algorithm parameters W, T, and X
determine the sensitivity and speed of the alignment. The BLASTN program (for
nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10,
M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP
program uses as default parameters a wordlength (W) of 3, an expectation (E) of 10, and
the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA
89:10915 (1989)).
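
The seed-and-extend strategy described above can be illustrated, for nucleotide
sequences only, by the following Python sketch. It is a simplification and not the
BLAST program: seeds are exact word matches (the neighborhood threshold T is not
modelled), extension is ungapped, only one strand is examined, and no statistics are
computed. The function names and the drop-off value x=20 are assumptions of the sketch.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend(query, subject, qi, sj, step, base, m, n, x):
    """Extend without gaps from (qi, sj) in direction `step` (+1 or -1).

    Extension stops when the score drops x below its maximum, when the
    cumulative score reaches zero or below, or at the end of either sequence.
    Returns (best_cumulative_score, residues_added_at_best).
    """
    score = best = base
    length = best_length = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_length = score, length
        elif best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best, best_length


def best_hsp(query, subject, w=11, m=5, n=-4, x=20):
    """Return (score, query_start, subject_start, length) of the best HSP, or None."""
    best = None
    for qi, sj in find_seeds(query, subject, w):
        right_score, right_len = extend(query, subject, qi + w, sj + w, 1, w * m, m, n, x)
        total, left_len = extend(query, subject, qi - 1, sj - 1, -1, right_score, m, n, x)
        hsp = (total, qi - left_len, sj - left_len, w + left_len + right_len)
        if best is None or hsp[0] > best[0]:
            best = hsp
    return best
```

A 20-nucleotide region shared by two otherwise unrelated sequences scores 20 x 5 = 100
under these parameters, and the mismatching flanks are excluded because they only
lower the cumulative score.
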
The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is
the smallest sum probability (P(N)), which provides an indication of the probability by
which a match between two nucleotide or amino acid sequences would occur by chance.
For example, a nucleic acid is considered similar to a reference sequence if the smallest
sum probability in a comparison of the test nucleic acid to the reference nucleic acid is
less than about 0.1, more preferably less than about 0.01, and most preferably less than
about 0.001.

An indication that two nucleic acid sequences or polypeptides are
substantially identical is that the polypeptide encoded by the first nucleic acid is
immunologically cross reactive with the antibodies raised against the polypeptide
encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically
substantially identical to a second polypeptide, for example, where the two peptides differ
only by conservative substitutions. Another indication that two nucleic acid sequences
are substantially identical is that the two molecules or their complements hybridize to
each other under stringent conditions, as described below.

Another indication that polynucleotide sequences are substantially
identical is if two molecules hybridize to each other under stringent conditions. Stringent
conditions are sequence dependent and will be different in different circumstances.
Generally, stringent conditions are selected to be about 5°C lower than the thermal
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm
is the temperature (under defined ionic strength and pH) at which 50% of the target
sequence hybridizes to a perfectly matched probe. Typically stringent conditions for a
Southern blot protocol involve hybridizing in a buffer comprising 5x SSC, 1% SDS at
65°C or hybridizing in a buffer containing 5x SSC and 1% SDS at 42°C and washing at
65°C with a 0.2x SSC, 0.1% SDS wash.
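
As a rough worked example of selecting a temperature about 5°C below the Tm, the
following Python sketch estimates the Tm of a short oligonucleotide probe and subtracts
the offset. The estimate uses the Wallace rule (2°C per A or T, 4°C per G or C), which
is not taken from this specification, applies only to short oligonucleotides, and
ignores ionic strength and pH. The function names are hypothetical.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees Celsius for a short oligonucleotide (Wallace rule).

    This rule of thumb is an assumption of the sketch; it does not model
    the defined ionic strength and pH on which the true Tm depends.
    """
    sequence = oligo.upper()
    at_count = sequence.count("A") + sequence.count("T")
    gc_count = sequence.count("G") + sequence.count("C")
    if at_count + gc_count != len(sequence):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2.0 * at_count + 4.0 * gc_count


def stringent_temperature(tm_celsius, offset=5.0):
    """Return a temperature `offset` degrees below the melting point."""
    return tm_celsius - offset
```

For a 16-mer with eight A/T and eight G/C residues the estimate is 48°C, giving a
stringent temperature of about 43°C. Longer probes, such as those used in Southern
blots, call for empirically determined conditions like those stated above.
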

A "label" is a composition detectable by spectroscopic, photochemical,
biochemical, immunochemical, or chemical means. For example, useful labels include
32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an
ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal
antibodies are available.



The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides
and polymers thereof in either single- or double-stranded form. The term encompasses
nucleic acids containing known nucleotide analogs or modified backbone residues or
linkages, which are synthetic, naturally occurring, and non-naturally occurring, which
have similar binding properties as the reference nucleic acid, and which are metabolized
in a manner similar to the reference nucleotides. Examples of such analogs include,
without limitation, phosphorothioates, phosphoramidates, methyl phosphonates,
chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs).

Unless otherwise indicated, a particular nucleic acid sequence also
implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon
substitutions) and complementary sequences, as well as the sequence explicitly indicated.
The term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide,
and polynucleotide.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence
through one or more types of chemical bonds, usually through complementary base
pairing, usually through hydrogen bond formation. As used herein, a probe may include
natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In
addition, the bases in a probe may be joined by a linkage other than a phosphodiester
bond, so long as it does not interfere with hybridization. Thus, for example, probes may
be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather
than phosphodiester linkages. It will be understood by one of skill in the art that probes
may bind target sequences lacking complete complementarity with the probe sequence
depending upon the stringency of the hybridization conditions. The probes are preferably
directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly
labeled such as with biotin to which a streptavidin complex may later bind. By assaying
for the presence or absence of the probe, one can detect the presence or absence of the
select sequence or subsequence.

A labeled nucleic acid probe or oligonucleotide is one that is bound, either
covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label
such that the presence of the probe may be detected by detecting the presence of the label
bound to the probe.



"Pharmaceutically acceptable" means a material that is not biologically or
otherwise undesirable, i.e., the material can be administered to an individual along with a
Chlamydia antigen without causing any undesirable biological effects or interacting in a
deleterious manner with any of the other components of the pharmaceutical composition.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid
polymers in which one or more amino acid residue is an analog or mimetic of a
corresponding naturally occurring amino acid, as well as to naturally occurring amino
acid polymers.

The phrase "specifically or selectively hybridizing to," refers to
hybridization between a probe and a target sequence in which the probe binds
substantially only to the target sequence, forming a hybridization complex, when the
target is in a heterogeneous mixture of polynucleotides and other compounds. Such
hybridization is determinative of the presence of the target sequence. Although the probe
may bind other unrelated sequences, at least 90%, preferably 95% or more of the
hybridization complexes formed are with the target sequence.

The term "recombinant" when used with reference to a cell, or nucleic
acid, or vector, indicates that the cell, or nucleic acid, or vector, has been modified by the
introduction of a heterologous nucleic acid or the alteration of a native nucleic acid, or
that the cell is derived from a cell so modified. Thus, for example, recombinant cells
express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all.

The phrase "specifically immunoreactive with", when referring to a protein
or peptide, refers to a binding reaction between the protein and an antibody which is
determinative of the presence of the protein in the presence of a heterogeneous population
of proteins and other compounds. Thus, under designated immunoassay conditions, the
specified antibodies bind to a particular protein and do not bind in a significant amount to
other proteins present in the sample. Specific binding to an antibody under such
conditions may require an antibody that is selected for its specificity for a particular
protein. A variety of immunoassay formats may be used to select antibodies specifically
immunoreactive with a particular protein and are described in detail below.
The phrase "substantially pure" or "isolated" when referring to a
Chlamydia peptide or protein, means a chemical composition which is free of other
subcellular components of the Chlamydia organism. Typically, a monomeric protein is
substantially pure when at least about 85% or more of a sample exhibits a single
polypeptide backbone. Minor variants or chemical modifications may typically share the
same polypeptide sequence. Depending on the purification procedure, purities of 85%,
and preferably over 95% pure are possible. Protein purity or homogeneity may be
indicated by a number of means well known in the art, such as polyacrylamide gel
electrophoresis of a protein sample, followed by visualizing a single polypeptide band on
a polyacrylamide gel upon silver staining. For certain purposes high resolution will be
needed and HPLC or a similar means for purification utilized.

DETAILED DESCRIPTION 
The present invention provides the nucleotide sequence of the C.
pneumoniae genome, SEQ ID NO: 1, or a representative fragment thereof, in a form which
can be readily used, analyzed, and interpreted by a skilled artisan. As used herein, a
"representative fragment" of the nucleotide sequence depicted in SEQ ID NO: 1 refers to
any portion which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are open reading frames,
expression modulating fragments, uptake modulating fragments, and fragments which can
be used to diagnose the presence of C. pneumoniae in a sample. Using the information
provided in the present application, together with routine cloning and sequencing
methods, one of ordinary skill in the art will be able to clone and sequence all
"representative fragments" of interest including open reading frames (ORFs) encoding a
large variety of C. pneumoniae proteins. A non-limiting identification of such preferred
representative fragments is provided in Tables 2 and 3.
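
Purely as an illustration of how open reading frames might be located in a genomic
sequence, the following Python sketch scans all six reading frames for stretches that
begin with ATG and end at the first in-frame stop codon. It is not the method used to
prepare Tables 2 and 3. The ATG-only start rule, the minimum length threshold and the
function names are assumptions of the sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string over A, C, G and T."""
    return dna.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Return (start, end, orf) for ATG-to-stop ORFs in the three frames of seq.

    `end` is exclusive and includes the stop codon; `min_codons` counts the
    codons before the stop.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3, seq[start:i + 3]))
                start = None
    return found


def find_orfs(dna, min_codons=30):
    """Return (strand, start, end, orf) for ORFs on both strands.

    Coordinates for the "-" strand refer to the reverse-complemented string.
    """
    dna = dna.upper()
    forward = [("+", s, e, orf) for s, e, orf in orfs_in_strand(dna, min_codons)]
    reverse = [
        ("-", s, e, orf)
        for s, e, orf in orfs_in_strand(reverse_complement(dna), min_codons)
    ]
    return forward + reverse
```

Candidate ORFs found in this way would ordinarily be filtered further, for example by
database comparison, before being treated as genes.
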

Diagnostic use of C. pneumoniae nucleic acids 

Hybridization-based assays

Using the nucleic acids disclosed here, one of skill can design nucleic acid
hybridization-based assays for the detection of C. pneumoniae. Any of a number of well
known techniques for the specific detection of target nucleic acids can be used.
Exemplary hybridization-based assays include, but are not limited to, traditional "direct
probe" methods such as Southern blots, dot blots, in situ hybridization (e.g., FISH), PCR,
and the like. The methods can be used in a wide variety of formats including, but not
limited to, substrate- (e.g., membrane or glass) bound methods or array-based approaches
as described below. As noted above, this invention also embraces methods for detecting
the presence of Chlamydia DNA or RNA in biological samples. These sequences can be
used to detect Chlamydia in biological samples from patients suspected of being infected.
A variety of methods of specific DNA and RNA measurement using nucleic acid
hybridization techniques are known to those of skill in the art (see Sambrook et al.,
supra).

In situ hybridization assays are well known (e.g., Angerer (1987) Meth.
Enzymol 152: 649). Generally, in situ hybridization comprises the following major steps:
(1) fixation of tissue or biological structure to be analyzed; (2) prehybridization treatment of
the biological structure to increase accessibility of target DNA, and to reduce nonspecific
binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the
biological structure or tissue; (4) post-hybridization washes to remove nucleic acid
fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid
fragments. The reagent used in each of these steps and the conditions for use vary
depending on the particular application.

In a typical in situ hybridization assay, cells are fixed to a solid support,
typically a glass slide. If a nucleic acid is to be probed, the cells are
typically denatured with heat or alkali. The cells are then contacted with a
hybridization solution at a moderate temperature to permit annealing of labeled
probes specific to the nucleic acid sequence encoding the protein. The targets
(e.g., cells) are then typically washed at a predetermined stringency or at an
increasing stringency until an appropriate signal-to-noise ratio is obtained.

The nucleic acids of this invention are particularly well suited to array-based
hybridization formats. Arrays are a multiplicity of different "probe" or "target"
nucleic acids (or other compounds) attached to one or more surfaces (e.g., solid,
membrane, or gel). In a preferred embodiment, the multiplicity of nucleic acids
(or other moieties) is attached to a single contiguous surface or to a
multiplicity of surfaces juxtaposed to each other.

In an array format, a large number of different hybridization reactions can be
run essentially "in parallel." This provides rapid, essentially simultaneous,
evaluation of a number of hybridizations in a single "experiment". Methods of
performing hybridization reactions in array-based formats are well known to those
of skill in the art (see, e.g., Pastinen (1997) Genome Res. 7: 606-614; Jackson
(1996) Nature Biotechnology 14: 1685; Chee (1995) Science 274: 610; WO 96/17958).

Arrays, particularly nucleic acid arrays, can be produced according to a wide
variety of methods well known to those of skill in the art. For example, in a
simple embodiment, "low density" arrays can simply be produced by spotting (e.g.,
by hand using a pipette) different nucleic acids at different locations on a
solid support (e.g., a glass surface, a membrane, etc.).

This simple spotting approach has been automated to produce high density spotted
arrays (see, e.g., U.S. Patent No. 5,807,522). This patent describes the use of an
automated system that taps a microcapillary against a surface to deposit a small
volume of a biological sample. The process is repeated to generate high density
arrays. Arrays can also be produced using oligonucleotide synthesis technology.
Thus, for example, U.S. Patent No. 5,143,854 and PCT patent publication Nos. WO
90/15070 and 92/10092 teach the use of light-directed combinatorial synthesis of
high density oligonucleotide arrays.

Many methods for immobilizing nucleic acids on a variety of solid surfaces are
known in the art. A wide variety of organic and inorganic polymers, as well as
other materials, both natural and synthetic, can be employed as the material for
the solid surface. Illustrative solid surfaces include, e.g., nitrocellulose,
nylon, glass, quartz, diazotized membranes (paper or nylon), silicones,
polyformaldehyde, cellulose, and cellulose acetate. In addition, plastics such as
polyethylene, polypropylene, polystyrene, and the like can be used. Other
materials which may be employed include paper, ceramics, metals, metalloids,
semiconductive materials, cermets or the like. In addition, substances that form
gels can be used. Such materials include, e.g., proteins (e.g., gelatins),
lipopolysaccharides, silicates, agarose and polyacrylamides. Where the solid
surface is porous, various pore sizes may be employed depending upon the nature
of the system.

In preparing the surface, a plurality of different materials may be employed,
particularly as laminates, to obtain various properties. For example, proteins
(e.g., bovine serum albumin) or mixtures of macromolecules (e.g., Denhardt's
solution) can be employed to avoid non-specific binding, simplify covalent
conjugation, enhance signal detection or the like. If covalent bonding between a
compound and the surface is desired, the surface will usually be polyfunctional
or be capable of being polyfunctionalized. Functional groups which may be present
on the surface and used for linking can include carboxylic acids, aldehydes,
amino groups, cyano groups, ethylenic groups, hydroxyl groups, mercapto groups
and the like. The manner of linking a wide variety of compounds to various
surfaces is well known and is amply illustrated in the literature.

For example, methods for immobilizing nucleic acids by introduction of various
functional groups to the molecules are known (see, e.g., Bischoff (1987) Anal.
Biochem. 164: 336-344; Kremsky (1987) Nucl. Acids Res. 15: 2891-2910). Modified
nucleotides can be placed on the target using PCR primers containing the modified
nucleotide, or by enzymatic end labeling with modified nucleotides. Use of glass
or membrane supports (e.g., nitrocellulose, nylon, polypropylene) for the nucleic
acid arrays of the invention is advantageous because of well developed technology
employing manual and robotic methods of arraying targets at relatively high
element densities. Such membranes are generally available, and protocols and
equipment for hybridization to membranes are well known.

Target elements of various sizes, ranging from 1 mm diameter down to 1 µm, can be
used. Smaller target elements containing low amounts of concentrated, fixed probe
DNA are used for high complexity comparative hybridizations since the total
amount of sample available for binding to each target element will be limited.
Thus it is advantageous to have small array target elements that contain a small
amount of concentrated probe DNA so that the signal that is obtained is highly
localized and bright. Such small array target elements are typically used in
arrays with densities greater than 10⁴/cm². Relatively simple approaches capable
of quantitative fluorescent imaging of 1 cm² areas have been described that
permit acquisition of data from a large number of target elements in a single
image (see, e.g., Wittrup (1994) Cytometry 16: 206-213).
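
As a rough illustration of how target element spacing relates to array density,
the following sketch counts the elements that fit in 1 cm² when they are laid out
on a square grid. The square-grid layout, the function name and the example pitch
values are assumptions made only for this example.

```python
def elements_per_cm2(pitch_um: float) -> int:
    """Number of array elements per cm^2 on a square grid.

    pitch_um is the assumed center-to-center spacing in micrometers.
    """
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    per_side = int(10_000 // pitch_um)  # 1 cm = 10,000 micrometers
    return per_side * per_side


if __name__ == "__main__":
    for pitch in (1000.0, 100.0, 50.0):  # hypothetical pitches
        print(pitch, "um pitch ->", elements_per_cm2(pitch), "elements per cm^2")
```

Under these assumptions, halving the pitch quadruples the number of target
elements that can be imaged in the same area.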

If fluorescently labeled nucleic acid samples are used, arrays on solid surface
substrates with much lower fluorescence than membranes, such as glass, quartz, or
small beads, can achieve much better sensitivity. Substrates such as glass or
fused silica are advantageous in that they provide a very low fluorescence
substrate and a highly efficient hybridization environment. Covalent attachment
of the target nucleic acids to glass or synthetic fused silica can be
accomplished according to a number of known techniques (described above). Nucleic
acids can be conveniently coupled to glass using commercially available reagents.
For instance, materials for preparation of silanized glass with a number of
functional groups are commercially available or can be prepared using standard
techniques (see, e.g., Gait (1984) Oligonucleotide Synthesis: A Practical
Approach, IRL Press, Wash., D.C.). Quartz cover slips, which have at least 10-fold
lower autofluorescence than glass, can also be silanized.

Alternatively, probes can also be immobilized on commercially available coated
beads or other surfaces. For instance, biotin end-labeled nucleic acids can be
bound to commercially available avidin-coated beads. Streptavidin or
anti-digoxigenin antibody can also be attached to silanized glass slides by
protein-mediated coupling using, e.g., protein A following standard protocols
(see, e.g., Smith (1992) Science 258: 1122-1126). Biotin or digoxigenin
end-labeled nucleic acids can be prepared according to standard techniques.
Hybridization to nucleic acids attached to beads is accomplished by suspending
them in the hybridization mix, and then depositing them on the glass substrate
for analysis after washing. Alternatively, paramagnetic particles, such as ferric
oxide particles, with or without avidin coating, can be used.

A variety of other nucleic acid hybridization formats are known to those skilled
in the art. For example, common formats include sandwich assays and competition
or displacement assays. Hybridization techniques are generally described in Hames
and Higgins (1985) Nucleic Acid Hybridization, A Practical Approach, IRL Press;
Gall and Pardue (1969) Proc. Natl. Acad. Sci. USA 63: 378-383; and John et al.
(1969) Nature 223: 582-587.

Sandwich assays are commercially useful hybridization assays for detecting or
isolating nucleic acid sequences. Such assays utilize a "capture" nucleic acid
covalently immobilized to a solid support and a labeled "signal" nucleic acid in
solution. The sample will provide the target nucleic acid. The "capture" nucleic
acid and "signal" nucleic acid probe hybridize with the target nucleic acid to
form a "sandwich" hybridization complex. To be most effective, the signal nucleic
acid should not hybridize with the capture nucleic acid.
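
The requirement that the signal nucleic acid not hybridize with the capture
nucleic acid can be screened for computationally before probes are synthesized.
The following is a minimal sketch, assuming DNA probes written 5' to 3' and
perfect Watson-Crick pairing only; the function names and the example sequences
are hypothetical and are not taken from the sequences disclosed here.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence given 5' to 3'."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def longest_complementary_run(probe_a: str, probe_b: str) -> int:
    """Longest stretch of probe_a that can pair antiparallel with probe_b.

    A stretch of probe_a pairs with probe_b when it also occurs in the
    reverse complement of probe_b.
    """
    return longest_common_substring(probe_a.upper(), reverse_complement(probe_b))


if __name__ == "__main__":
    capture = "ATGCGTACCTGAAGC"  # hypothetical capture probe
    signal = "TTGACCGATAGCTTC"   # hypothetical signal probe
    print(longest_complementary_run(capture, signal))
```

A long complementary run suggests the two probes could anneal to each other
rather than to the target; what length is acceptable depends on the
hybridization conditions and is left to the user.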

Detection of a hybridization complex may require the binding of a
signal-generating complex to a duplex of target and probe polynucleotides or
nucleic acids. Typically, such binding occurs through ligand and anti-ligand
interactions, as between a ligand-conjugated probe and an anti-ligand conjugated
with a signal.

The sensitivity of the hybridization assays may be enhanced through use of a
nucleic acid amplification system that multiplies the target nucleic acid being
detected. Examples of such systems include the polymerase chain reaction (PCR)
system and the ligase chain reaction (LCR) system. Other methods recently
described in the art are the nucleic acid sequence based amplification (NASBA®,
Cangene, Mississauga, Ontario) and Q Beta Replicase systems.
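
To give a sense of how strongly amplification multiplies the target, the sketch
below uses the idealized exponential model of PCR, in which each cycle increases
the copy number by a factor of (1 + efficiency). The model ignores the plateau
phase and reagent depletion, and the function name and parameter values are
assumptions chosen for illustration.

```python
def amplified_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield: initial_copies * (1 + efficiency) ** cycles.

    efficiency is the assumed per-cycle efficiency, between 0 (no
    amplification) and 1 (perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Ten starting copies, 30 cycles, at two assumed efficiencies.
    print(amplified_copies(10, 30))
    print(amplified_copies(10, 30, efficiency=0.9))
```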

Nucleic acid hybridization simply involves providing a denatured probe and target
nucleic acid under conditions where the probe and its complementary target can
form stable hybrid duplexes through complementary base pairing. The nucleic acids
that do not form hybrid duplexes are then washed away, leaving the hybridized
nucleic acids to be detected, typically through detection of an attached
detectable label. It is generally recognized that nucleic acids are denatured by
increasing the temperature or decreasing the salt concentration of the buffer
containing the nucleic acids, by adding chemical agents, or by raising the pH.
Under low stringency conditions (e.g., low temperature and/or high salt and/or
high target concentration) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA)
will form even where the annealed sequences are not perfectly complementary. Thus
specificity of hybridization is reduced at lower stringency. Conversely, at
higher stringency (e.g., higher temperature or lower salt) successful
hybridization requires fewer mismatches.
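
The dependence of duplex stability on salt and denaturing agents can be made
concrete with a melting temperature estimate. The sketch below uses a commonly
quoted empirical approximation for long DNA:DNA hybrids, Tm = 81.5 + 16.6
log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L. This formula is not part of
the present disclosure, its coefficients vary slightly between references, and
the function names are assumptions; it is intended only to show the direction of
each effect.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def estimate_tm_celsius(seq: str, sodium_molar: float, formamide_percent: float = 0.0) -> float:
    """Approximate Tm of a long DNA:DNA hybrid (empirical formula, see lead-in)."""
    if not seq or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and a positive salt concentration")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent(seq)
            - 0.61 * formamide_percent
            - 500.0 / len(seq))


if __name__ == "__main__":
    probe = "GC" * 50 + "AT" * 50  # hypothetical 200-nt probe, 50% GC
    for salt in (1.0, 0.1, 0.01):
        print(salt, "M Na+:", round(estimate_tm_celsius(probe, salt), 1), "C")
    print("with 50% formamide:", round(estimate_tm_celsius(probe, 1.0, 50.0), 1), "C")
```

Lower salt or added formamide lowers the estimated Tm, so a wash at a fixed
temperature becomes more stringent under those conditions.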

One of skill in the art will appreciate that hybridization conditions may be
selected to provide any degree of stringency. In a preferred embodiment,
hybridization is performed at low stringency to ensure hybridization and then
subsequent washes are performed at higher stringency to eliminate mismatched
hybrid duplexes. Successive washes may be performed at increasingly higher
stringency (e.g., down to as low as 0.25 X SSPE-T at 37°C to 70°C) until a desired
level of hybridization specificity is obtained. Stringency can also be increased
by addition of agents such as formamide. Hybridization specificity may be
evaluated by comparison of hybridization to the test probes with hybridization to
the various controls that can be present.

In general, there is a tradeoff between hybridization specificity (stringency)
and signal intensity. Thus, in a preferred embodiment, the wash is performed at
the highest stringency that produces consistent results and that provides a
signal intensity greater than approximately 10% of the background intensity.
Thus, in a preferred embodiment, the hybridized array may be washed at
successively higher

wo 00/27994 



PCT/US99/26923 



stringency solutions and read between each wash. Analysis of the data sets thus
produced will reveal a wash stringency above which the hybridization pattern is
not appreciably altered and which provides adequate signal for the particular
probes of interest.
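
The wash selection described above can be sketched as a simple rule applied to
readings taken after each successive wash. The example below keeps only the
signal criterion (signal greater than approximately 10% of background) and does
not model the "consistent results" requirement; the data structure, labels and
intensity values are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class WashReading(NamedTuple):
    label: str         # free-text description of the wash (hypothetical)
    signal: float      # measured signal intensity after this wash
    background: float  # measured background intensity after this wash


def highest_usable_wash(readings: Sequence[WashReading],
                        min_fraction: float = 0.10) -> Optional[WashReading]:
    """Return the most stringent wash whose signal exceeds
    min_fraction * background.

    readings must be ordered from lowest to highest stringency.
    Returns None if no wash meets the criterion.
    """
    chosen = None
    for reading in readings:
        if reading.signal > min_fraction * reading.background:
            chosen = reading
    return chosen


if __name__ == "__main__":
    series = [
        WashReading("wash 1 (lowest stringency)", signal=900.0, background=1000.0),
        WashReading("wash 2", signal=400.0, background=800.0),
        WashReading("wash 3 (highest stringency)", signal=50.0, background=700.0),
    ]
    print(highest_usable_wash(series))
```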

Methods of optimizing hybridization conditions are well known to those of skill
in the art (see, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and
Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, Elsevier,
N.Y.).

Labeling and detection of nucleic acids. 

In a preferred embodiment, the hybridized nucleic acids are detected by detecting
one or more labels attached to the sample or probe nucleic acids. The labels may
be incorporated by any of a number of means well known to those of skill in the
art. Means of attaching labels to nucleic acids include, for example, nick
translation or end-labeling (e.g., with a labeled RNA) by kinasing of the nucleic
acid and subsequent attachment (ligation) of a nucleic acid linker joining the
sample nucleic acid to a label (e.g., a fluorophore). A wide variety of linkers
for the attachment of labels to nucleic acids are also known. In addition,
intercalating dyes and fluorescent nucleotides can also be used.

Detectable labels suitable for use in the present invention include any
composition detectable by spectroscopic, photochemical, biochemical,
immunochemical, electrical, optical or chemical means. Useful labels in the
present invention include biotin for staining with labeled streptavidin
conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g.,
fluorescein, Texas red, rhodamine, green fluorescent protein, and the like, see,
e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., ³H, ³⁵S, ¹⁴C, or
³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others
commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g.,
gold particles in the 40-80 nm diameter size range scatter green light with high
efficiency) or colored glass or plastic (e.g., polystyrene, polypropylene, latex,
etc.) beads. Patents teaching the use of such labels include U.S. Patent Nos.
3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

A fluorescent label is preferred because it provides a very strong signal with
low background. It is also optically detectable at high resolution and
sensitivity through a quick scanning procedure. The nucleic acid samples can all
be labeled with a single label, e.g., a single fluorescent label. Alternatively,
in another embodiment, different nucleic acid samples can be simultaneously
hybridized where each nucleic acid sample has a different label. For instance,
one target could have a green fluorescent label and a second target could have a
red fluorescent label. The scanning step will distinguish sites of binding of the
red label from those binding the green fluorescent label. Each nucleic acid
sample (target nucleic acid) can be analyzed independently from one another.

Suitable chromogens which can be employed include those molecules and compounds
which absorb light in a distinctive range of wavelengths so that a color can be
observed or, alternatively, which emit light when irradiated with radiation of a
particular wavelength or wavelength range, e.g., fluorescers.

Desirably, fluorescers should absorb light above about 300 nm, preferably about
350 nm, and more preferably above about 400 nm, usually emitting at wavelengths
greater than about 10 nm higher than the wavelength of the light absorbed. It
should be noted that the absorption and emission characteristics of the bound dye
can differ from the unbound dye. Therefore, when referring to the various
wavelength ranges and characteristics of the dyes, it is intended to indicate the
dyes as employed and not the dye which is unconjugated and characterized in an
arbitrary solvent.
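
The wavelength preferences stated above can be expressed as a simple check on a
candidate fluorescer. This is only a sketch: the function name is an assumption,
the wavelengths passed in should be those of the dye as employed (i.e., bound),
and the example values are hypothetical rather than measured properties of any
particular dye.

```python
def meets_fluorescer_preferences(absorption_nm: float,
                                 emission_nm: float,
                                 min_absorption_nm: float = 300.0,
                                 min_shift_nm: float = 10.0) -> bool:
    """True if the dye absorbs above min_absorption_nm and emits more than
    min_shift_nm above its absorption wavelength."""
    absorbs_high_enough = absorption_nm > min_absorption_nm
    shift_large_enough = (emission_nm - absorption_nm) > min_shift_nm
    return absorbs_high_enough and shift_large_enough


if __name__ == "__main__":
    # Hypothetical bound-dye wavelengths, in nm.
    print(meets_fluorescer_preferences(490.0, 520.0))
    print(meets_fluorescer_preferences(280.0, 340.0))
    # Stricter "more preferably above about 400 nm" threshold.
    print(meets_fluorescer_preferences(350.0, 450.0, min_absorption_nm=400.0))
```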

Fluorescers are generally preferred because by irradiating a fluorescer with 
light, one can obtain a plurality of emissions. Thus, a single label can provide for a 
plurality of measurable events. 

Detectable signal can also be provided by chemiluminescent and bioluminescent
sources. Chemiluminescent sources include a compound which becomes electronically
excited by a chemical reaction and can then emit light which serves as the
detectable signal or donates energy to a fluorescent acceptor. Alternatively,
luciferins can be used in conjunction with luciferase or lucigenins to provide
bioluminescence.

Spin labels are provided by reporter molecules with an unpaired electron spin
which can be detected by electron spin resonance (ESR) spectroscopy. Exemplary
spin labels include organic free radicals, transition metal complexes,
particularly vanadium, copper, iron, and manganese, and the like. Exemplary spin
labels include nitroxide free radicals.

The label may be added to the target (sample) nucleic acid(s) prior to, or after,
the hybridization. So-called "direct labels" are detectable labels that are
directly attached to or incorporated into the target (sample) nucleic acid prior
to hybridization. In contrast, so-called "indirect labels" are joined to the
hybrid duplex after hybridization. Often, the indirect label is attached to a
binding moiety that has been attached to the target nucleic acid prior to the
hybridization. Thus, for example, the target nucleic acid may be biotinylated
before the hybridization. After hybridization, an avidin-conjugated fluorophore
will bind the biotin-bearing hybrid duplexes, providing a label that is easily
detected. For a detailed review of methods of labeling nucleic acids and
detecting labeled hybridized nucleic acids, see Laboratory Techniques in
Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid
Probes, P. Tijssen, ed., Elsevier, N.Y. (1993).

Fluorescent labels are easily added during an in vitro transcription reaction.
Thus, for example, fluorescein-labeled UTP and CTP can be incorporated into the
RNA produced in an in vitro transcription.

The labels can be attached directly or through a linker moiety. In general, the
site of label or linker-label attachment is not limited to any specific position.
For example, a label may be attached to a nucleoside, nucleotide, or analogue
thereof at any position that does not interfere with detection or hybridization
as desired. For example, certain Label-ON Reagents from Clontech (Palo Alto, CA)
provide for labeling interspersed throughout the phosphate backbone of an
oligonucleotide and for terminal labeling at the 3' and 5' ends. As shown for
example herein, labels can be attached at positions on the ribose ring or the
ribose can be modified and even eliminated as desired. The base moieties of
useful labeling reagents can include those that are naturally occurring or
modified in a manner that does not interfere with the purpose to which they are
put. Modified bases include but are not limited to 7-deaza A and G, 7-deaza-8-aza
A and G, and other heterocyclic moieties.

It will be recognized that fluorescent labels are not to be limited to single
species organic molecules, but include inorganic molecules, multi-molecular
mixtures of organic and/or inorganic molecules, crystals, heteropolymers, and the
like. Thus, for example, CdSe-CdS core-shell nanocrystals enclosed in a silica
shell can be easily derivatized for coupling to a biological molecule (Bruchez et
al. (1998) Science 281: 2013-2016). Similarly, highly fluorescent quantum dots
(zinc sulfide-capped cadmium selenide) have been covalently coupled to
biomolecules for use in ultrasensitive biological detection (Warren and Nie
(1998) Science 281: 2016-2018).

Amplification-based assays. 

In another embodiment, amplification-based assays can be used to detect nucleic
acids. In such amplification-based assays, the nucleic acid sequences act as a
template in an amplification reaction (e.g., polymerase chain reaction (PCR)).
Detailed protocols for quantitative PCR are provided in Innis et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, Inc., N.Y.

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see Wu and Wallace (1989) Genomics 4: 560; Landegren et al.
(1988) Science 241: 1077; and Barringer et al. (1990) Gene 89: 117),
transcription amplification (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:
1173), and self-sustained sequence replication (Guatelli et al. (1990) Proc.
Natl. Acad. Sci. USA 87: 1874).

Detection of C. pneumoniae gene expression

The nucleic acids of the invention can also be used to detect C. pneumoniae gene
transcripts. Methods of detecting and/or quantifying gene transcripts using
nucleic acid hybridization techniques are known to those of skill in the art (see
Sambrook et al., supra). For example, a Northern transfer may be used for the
detection of the desired mRNA directly. In brief, the mRNA is isolated from a
given cell sample using, for example, an acid guanidinium-phenol-chloroform
extraction method. The mRNA is then electrophoresed to separate the mRNA species
and the mRNA is transferred from the gel to a nitrocellulose membrane. As with
the Southern blots, labeled probes are used to identify and/or quantify the
target mRNA.

In another preferred embodiment, the gene transcript can be measured using
amplification (e.g., PCR) based methods as described above for directly assessing
copy number of the target sequences.

Expression of C. pneumoniae proteins

The nucleic acids disclosed here can be used for recombinant expression of the
proteins. In these methods, the nucleic acids encoding the proteins of interest
are introduced into suitable host cells, followed by induction of the cells to
produce large amounts of the protein. The invention relies on routine techniques
in the field of recombinant genetics, well known to those of ordinary skill in
the art. A basic text disclosing the general methods of use in this invention is
Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed. 1989).

Standard transfection methods are used to produce prokaryotic, mammalian, yeast
or insect cell lines which express large quantities of the desired polypeptide,
which is then purified using standard techniques (see, e.g., Colley et al., J.
Biol. Chem. 264:17619-17622, 1989; Guide to Protein Purification, supra).

The nucleotide sequences used to transfect the host cells can be modified to
yield Chlamydia polypeptides with a variety of desired properties. For example,
the polypeptides can vary from the naturally-occurring sequence at the primary
structure level by amino acid insertions, substitutions, deletions, and the like.
These modifications can be used in a number of combinations to produce the final
modified protein chain.
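
The three kinds of primary-structure modification named above, and their use in
combination, can be illustrated on a sequence written in one-letter amino acid
code. This is a minimal sketch with hypothetical function names and an arbitrary
example sequence; positions are 1-based, as is conventional for protein
sequences.

```python
def substitute(seq: str, position: int, residue: str) -> str:
    """Replace the residue at a 1-based position with a single new residue."""
    if not 1 <= position <= len(seq) or len(residue) != 1:
        raise ValueError("invalid position or residue")
    return seq[:position - 1] + residue + seq[position:]


def insert_after(seq: str, position: int, residues: str) -> str:
    """Insert residues after a 1-based position (0 inserts at the N-terminus)."""
    if not 0 <= position <= len(seq):
        raise ValueError("invalid position")
    return seq[:position] + residues + seq[position:]


def delete_range(seq: str, start: int, end: int) -> str:
    """Delete residues from start to end, both 1-based and inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("invalid range")
    return seq[:start - 1] + seq[end:]


if __name__ == "__main__":
    native = "MKTAYIAK"  # arbitrary example sequence, not from the disclosure
    variant = substitute(native, 2, "R")           # K2R substitution
    variant = insert_after(variant, 4, "GG")       # insert GG after position 4
    variant = delete_range(variant, 7, 8)          # delete two residues
    print(native, "->", variant)
```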

The amino acid sequence variants can be prepared with various objectives in mind,
including facilitating purification and preparation of the recombinant
polypeptide. The modified polypeptides are also useful for modifying plasma half
life, improving therapeutic efficacy, and lessening the severity or occurrence of
side effects during therapeutic use. The amino acid sequence variants are usually
predetermined variants that are not found in nature but exhibit the same
immunogenic activity as the naturally occurring protein. In general,
modifications of the sequences encoding the polypeptides may be readily
accomplished by a variety of well-known techniques, such as site-directed
mutagenesis (see Gillman & Smith, Gene 8:81-97 (1979); Roberts et al., Nature
328:731-734 (1987)). One of ordinary skill will appreciate that the effect of
many mutations is difficult to predict. Thus, most modifications are evaluated by
routine screening in a suitable assay for the desired characteristic. For
instance, the effect of various modifications on the ability of the polypeptide
to elicit a protective immune response can be easily determined using in vitro
assays. For instance, the polypeptides can be tested for their ability to induce
lymphoproliferation, T cell cytotoxicity, or cytokine production using standard
techniques.

The particular procedure used to introduce the genetic material into the host
cell for expression of the polypeptide is not particularly critical. Any of the
well-known procedures for introducing foreign nucleotide sequences into host
cells may be used. These include the use of calcium phosphate transfection,
spheroplasts, electroporation, liposomes, microinjection, plasmid vectors, viral
vectors and any of the other well-known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see
Sambrook et al., supra). It is only necessary that the particular procedure
utilized be capable of successfully introducing at least one gene into the host
cell which is capable of expressing the gene.

Any of a number of well-known cells and cell lines can be used to express the
polypeptides of the invention. For instance, prokaryotic cells such as E. coli
can be used. Eukaryotic cells include yeast, Chinese hamster ovary (CHO) cells,
COS cells, and insect cells.

The particular vector used to transport the genetic information into the cell is
also not particularly critical. Any of the conventional vectors used for
expression of recombinant proteins in prokaryotic and eukaryotic cells may be
used. Expression vectors for mammalian cells typically contain regulatory
elements from eukaryotic viruses.

The expression vector typically contains a transcription unit or expression
cassette that contains all the elements required for the expression of the
polypeptide DNA in the host cells. A typical expression cassette contains a
promoter operably linked to the DNA sequence encoding a polypeptide and signals
required for efficient polyadenylation of the transcript. The term "operably
linked" as used herein refers to linkage of a promoter upstream from a DNA
sequence such that the promoter mediates transcription of the DNA sequence. The
promoter is preferably positioned about the same distance from the heterologous
transcription start site as it is from the transcription start site in its
natural setting. As is known in the art, however, some variation in this distance
can be accommodated without loss of promoter function.

Following the growth of the recombinant cells and expression of the polypeptide,
the culture medium is harvested for purification of the secreted protein. The
media are typically clarified by centrifugation or filtration to remove cells and
cell debris and the proteins are concentrated by adsorption to any suitable resin
or by use of ammonium sulfate fractionation, polyethylene glycol precipitation,
or by ultrafiltration. Other routine means known in the art may be equally
suitable. Further purification of the polypeptide can be accomplished by standard
techniques, for example, affinity chromatography, ion exchange chromatography,
sizing chromatography, His6 tagging and Ni-agarose chromatography (as described
in Dobeli et al., Mol. and Biochem. Parasit. 41:259-268 (1990)), or other protein
purification techniques to obtain homogeneity. The purified proteins are then
used to produce pharmaceutical compositions, as described below.

An alternative method of preparing recombinant polypeptides useful as vaccines
involves the use of recombinant viruses (e.g., vaccinia). Vaccinia virus is grown
in suitable cultured mammalian cells such as the HeLa S3 spinner cells, as
described by Mackett et al., in DNA Cloning Vol. II: A Practical Approach, pp.
191-211 (Glover, ed.).

Antibody Production 

The proteins of the present invention can be used to produce antibodies
specifically reactive with C. pneumoniae antigens. If isolated proteins are used,
they may be recombinantly produced or isolated from Chlamydia cultures. Synthetic
peptides made using the protein sequences may also be used.

Methods of production of polyclonal antibodies are known to those of skill in the
art. In brief, an immunogen, preferably a purified protein, is mixed with an
adjuvant and animals are immunized. When appropriately high titers of antibody to
the immunogen are obtained, blood is collected from the animal and antisera are
prepared. Further fractionation of the antisera to enrich for antibodies reactive
to Chlamydia proteins can be done if desired (see Harlow & Lane, Antibodies: A
Laboratory Manual (1988)).

Polyclonal antisera are used to identify and characterize Chlamydia in the
tissues of patients using, for instance, in situ techniques and immunoperoxidase
test procedures described in Anderson et al., JAVMA 198:241 (1991) and Barr et
al., Vet. Pathol. 28:110-116 (1991).

Monoclonal antibodies may be obtained by various techniques familiar to those
skilled in the art. Briefly, spleen cells from an animal immunized with a desired
antigen are immortalized, commonly by fusion with a myeloma cell (see Kohler &
Milstein, Eur. J. Immunol. 6:511-519 (1976)). Alternative methods of
immortalization include transformation with Epstein Barr Virus, oncogenes, or
retroviruses, or other methods well known in the art. Colonies arising from
single immortalized cells are screened for production of antibodies of the
desired specificity and affinity for the antigen, and yield of the monoclonal
antibodies produced by such cells may be enhanced by various techniques,
including injection into the peritoneal cavity of a vertebrate host.

Monoclonal antibodies produced in such a manner are used, for instance, in ELISA
diagnostic tests, immunoperoxidase tests, immunohistochemical tests, for the in
vitro evaluation of spirochete invasion, to select candidate antigens for vaccine
development, protein isolation, and for screening genomic and cDNA libraries to
select appropriate gene sequences.

Immunodiagnostic detection of C. pneumoniae infections

The present invention also provides methods for detecting the presence or absence
of C. pneumoniae, or antibodies reactive with it, in a biological sample. For
instance, antibodies specifically reactive with Chlamydia can be detected using
either Chlamydia proteins or the isolates described here. The proteins and
isolates can also be used to raise specific antibodies (either monoclonal or
polyclonal) to detect the antigen in a sample. In addition, the nucleic acids
disclosed and claimed here can be used to detect Chlamydia-specific sequences
using standard hybridization techniques.

For a review of immunological and immunoassay procedures in general, see Basic
and Clinical Immunology (Stites & Terr eds., 7th ed. 1991). The immunoassays of
the present invention can be performed in any of several configurations, which
are reviewed extensively in Enzyme Immunoassay (Maggio, ed., 1980) and Tijssen,
Laboratory Techniques in Biochemistry and Molecular Biology (1985). For instance,
the proteins and antibodies disclosed here are conveniently used in ELISA,
immunoblot analysis and agglutination assays.

In brief, immunoassays to measure anti-Chlamydia antibodies or antigens
can be either competitive or noncompetitive binding assays. In competitive binding
assays, the sample analyte (e.g., anti-Chlamydia antibodies) competes with a labeled
analyte (e.g., anti-Chlamydia monoclonal antibody) for specific binding sites on a capture
agent (e.g., isolated Chlamydia protein) bound to a solid surface. The concentration of
labeled analyte bound to the capture agent is inversely proportional to the amount of free
analyte present in the sample.
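
The inverse relationship in a competitive binding assay can be illustrated with a
minimal sketch. It assumes an idealized model that is not stated in this
specification: labeled and unlabeled analyte bind the capture agent with equal
affinity, and the binding sites are limiting and fully occupied, so the bound label
is simply the labeled share of total analyte. The function name and the quantities
are hypothetical.

```python
def labeled_fraction_bound(labeled: float, sample_analyte: float) -> float:
    """Fraction of capture sites occupied by labeled analyte.

    Idealized assumption (not from the specification): equal affinity for
    labeled and unlabeled analyte, with saturated, limiting binding sites.
    """
    if labeled <= 0 or sample_analyte < 0:
        raise ValueError("labeled must be > 0 and sample_analyte >= 0")
    return labeled / (labeled + sample_analyte)


# The bound-label signal falls as unlabeled sample analyte increases.
for sample in (0.0, 1.0, 3.0, 9.0):
    print(sample, labeled_fraction_bound(1.0, sample))
```

Under these assumptions a sample with no analyte leaves all sites occupied by label,
and each increase in sample analyte lowers the measured signal.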

Noncompetitive assays are typically sandwich assays, in which the sample
analyte is bound between two analyte-specific binding reagents. One of the binding
agents is used as a capture agent and is bound to a solid surface. The second binding
agent is labelled and is used to measure or detect the resultant complex by visual or
instrument means.

A number of combinations of capture agent and labelled binding agent can
be used. For instance, an isolated Chlamydia protein or culture can be used as the
capture agent and labelled anti-human antibodies specific for the constant region of
human antibodies can be used as the labelled binding agent. Goat, sheep and other
non-human antibodies specific for human immunoglobulin constant regions (e.g., γ or μ) are
well known in the art. Alternatively, the anti-human antibodies can be the capture agent
and the antigen can be labelled.

Various components of the assay, including the antigen, anti-Chlamydia
antibody, or anti-human antibody, may be bound to a solid surface. Many methods for
immobilizing biomolecules to a variety of solid surfaces are known in the art. For
instance, the solid surface may be a membrane (e.g., nitrocellulose), a microtiter dish
(e.g., PVC or polystyrene) or a bead. The desired component may be covalently bound or
noncovalently attached through nonspecific bonding.

Alternatively, the immunoassay may be carried out in liquid phase and a
variety of separation methods may be employed to separate the bound labelled component
from the unbound labelled components. These methods are known to those of skill in the
art and include immunoprecipitation, column chromatography, adsorption, addition of
magnetizable particles coated with a binding agent and other similar procedures.

An immunoassay may also be carried out in liquid phase without a
separation procedure. Various homogeneous immunoassay methods are now being
applied to immunoassays for protein analytes. In these methods, the binding of the
binding agent to the analyte causes a change in the signal emitted by the label, so that
binding may be measured without separating the bound from the unbound labelled
component.

Western blot (immunoblot) analysis can also be used to detect the presence
of antibodies to Chlamydia in the sample. This technique is a reliable method for
confirming the presence of antibodies against a particular protein in the sample. The
technique generally comprises separating proteins by gel electrophoresis on the basis of
molecular weight, transferring the separated proteins to a suitable solid support (such as a
nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample
with the separated proteins. This causes specific target antibodies present in the sample
to bind their respective proteins. Target antibodies are then detected using labeled
anti-human antibodies.

The immunoassay formats described above employ labelled assay
components. The label may be coupled directly or indirectly to the desired component of
the assay according to methods well known in the art. A wide variety of labels may be
used. The component may be labelled by any one of several methods. Traditionally a
radioactive label incorporating 3H, 125I, 35S, 14C, or 32P was used. Non-radioactive labels
include ligands which bind to labelled antibodies, fluorophores, chemiluminescent agents,
enzymes, and antibodies which can serve as specific binding pair members for a labelled
ligand. The choice of label depends on sensitivity required, ease of conjugation with the
compound, stability requirements, and available instrumentation.

Enzymes of interest as labels will primarily be hydrolases, particularly
phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases.
Fluorescent compounds include fluorescein and its derivatives, rhodamine and its
derivatives, dansyl, umbelliferone, etc. Chemiluminescent compounds include luciferin,
and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labelling or
signal producing systems which may be used, see U.S. Patent No. 4,391,904, which is
incorporated herein by reference.

Non-radioactive labels are often attached by indirect means. Generally, a
ligand molecule (e.g., biotin) is covalently bound to the molecule. The ligand then binds
to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or
covalently bound to a signal system, such as a detectable enzyme, a fluorescent
compound, or a chemiluminescent compound. A number of ligands and anti-ligands can
be used. Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and
cortisol, it can be used in conjunction with the labelled, naturally occurring anti-ligands.
Alternatively, any haptenic or antigenic compound can be used in combination with an
antibody.

Some assay formats do not require the use of labelled components. For
instance, agglutination assays can be used to detect the presence of the target antibodies.
In this case, antigen-coated particles are agglutinated by samples comprising the target
antibodies. In this format, none of the components need be labelled and the presence of
the target antibody is detected by simple visual inspection.

Pharmaceutical Compositions 

The peptides or antibodies (typically monoclonal antibodies) of the present
invention and pharmaceutical compositions thereof are useful for administration to
mammals, particularly humans, to treat and/or prevent Chlamydia infections. Suitable
formulations are found in Remington's Pharmaceutical Sciences, Mack Publishing
Company, Philadelphia, PA, 17th ed. (1985).

The immunogenic peptides or antibodies of the invention are administered
prophylactically or to an individual already suffering from the disease. The peptide
compositions are administered to a patient in an amount sufficient to elicit an effective
immune response to Chlamydia. An effective immune response is one that inhibits
infection. An amount adequate to accomplish this is defined as a "therapeutically effective
dose" or "immunogenically effective dose." Amounts effective for this use will depend
on, e.g., the peptide composition, the manner of administration, the stage and severity of
the disease being treated, the weight and general state of health of the patient, and the
judgment of the prescribing physician, but generally range for the initial immunization
(that is, for therapeutic or prophylactic administration) from about 0.1 mg to about 1.0 mg
per 70 kilogram patient, more commonly from about 0.5 mg to about 0.75 mg per 70 kg
of body weight. Boosting dosages are typically from about 0.1 mg to about 0.5 mg of
peptide using a boosting regimen over weeks to months depending upon the patient's
response and condition. A suitable protocol would include injection at time 0, 4, 2, 6, 10
and 14 weeks, followed by further booster injections at 24 and 28 weeks.

For therapeutic use, administration should begin at the first sign of
infection. This is followed by boosting doses at least until symptoms are substantially
abated and for a period thereafter. In some circumstances, loading doses followed by
boosting doses may be required. The resulting immune response helps to cure or at least
partially arrest symptoms and/or complications. Vaccine compositions containing the
peptides are administered prophylactically to a patient susceptible to or otherwise at risk
of the infection.

The pharmaceutical compositions (containing either peptides or
antibodies) are intended for parenteral or oral administration. Preferably, the
pharmaceutical compositions are administered parenterally, e.g., subcutaneously,
intradermally, or intramuscularly. Thus, the invention provides compositions for
parenteral administration which comprise a solution of the immunogenic polypeptides
dissolved or suspended in an acceptable carrier, preferably an aqueous carrier. A variety
of aqueous carriers may be used, e.g., water, buffered water, 0.4% saline, 0.3% glycine,
hyaluronic acid and the like. These compositions may be sterilized by conventional, well
known sterilization techniques, or may be sterile filtered. The resulting aqueous solutions
may be packaged for use as is, or lyophilized, the lyophilized preparation being combined
with a sterile solution prior to administration. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions, such as buffering agents, tonicity adjusting agents, wetting
agents and the like, for example, sodium acetate, sodium lactate, sodium chloride,
potassium chloride, calcium chloride, sorbitan monolaurate, triethanolamine oleate, etc.

The compositions may also comprise carriers to enhance the immune
response. Useful carriers are well known in the art, and include, e.g., KLH,
thyroglobulin, albumins such as human serum albumin, tetanus toxoid, polyamino acids
such as poly(lysine:glutamic acid), influenza, hepatitis B virus core protein, hepatitis B
virus recombinant vaccine and the like.

For solid compositions, conventional nontoxic solid carriers may be used
which include, for example, pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium
carbonate, and the like. For oral administration, a pharmaceutically acceptable nontoxic
composition is formed by incorporating any of the normally employed excipients, such as
those carriers previously listed, and generally 10-95% of active ingredient, that is, one or
more peptides of the invention, and more preferably at a concentration of 25%-75%.

As noted above, the peptide compositions are intended to induce an
immune response to Chlamydia. Thus, compositions and methods of administration
suitable for maximizing the immune response are preferred. For instance, peptides may
be introduced into a host, including humans, linked to a carrier or as a homopolymer or
heteropolymer of active peptide units from various Chlamydia proteins disclosed here.
Alternatively, a "cocktail" of polypeptides can be used. A mixture of more than one
polypeptide has the advantage of increased immunological reaction and, where different
peptides are used to make up the polymer, the additional ability to induce antibodies to a
number of epitopes.

The compositions also include an adjuvant. A number of
adjuvants are well known to one skilled in the art. Suitable adjuvants include incomplete
Freund's adjuvant, alum, aluminum phosphate, aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-
glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE),
and RIBI, which contains three components extracted from bacteria, monophosphoryl
lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2%
squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by
measuring the amount of antibodies directed against the immunogenic peptide.

The concentration of immunogenic peptides of the invention in the
pharmaceutical formulations can vary widely, i.e., from less than about 0.1%, usually at
or at least about 2%, to as much as 20% to 50% or more by weight, and will be selected
primarily by fluid volumes, viscosities, etc., in accordance with the particular mode of
administration selected.

The peptides of the invention can also be expressed by attenuated viral
hosts, such as vaccinia or fowlpox. This approach involves the use of vaccinia virus as a
vector to express nucleotide sequences that encode the peptides of the invention. Upon
introduction into a host, the recombinant vaccinia virus expresses the immunogenic
peptide, and thereby elicits an immune response. Vaccinia vectors and methods useful in
immunization protocols are described in, e.g., U.S. Patent No. 4,722,848. Another vector
is BCG (Bacille Calmette Guerin). BCG vectors are described in Stover et al. (Nature
351:456-460 (1991)). A wide variety of other vectors useful for therapeutic
administration or immunization of the peptides of the invention, e.g., Salmonella typhi
vectors and the like, will be apparent to those skilled in the art from the description
herein.

The DNA encoding one or more of the peptides of the invention can also
be administered to the patient. This approach is described, for instance, in Wolff et al.,
Science 247:1465-1468 (1990) as well as U.S. Patent Nos. 5,580,859 and 5,589,466.

In order to enhance serum half-life, the peptides may also be encapsulated,
introduced into the lumen of liposomes, prepared as a colloid, or other conventional
techniques may be employed which provide an extended serum half-life of the peptides.
A variety of methods are available for preparing liposomes, as described in, e.g., Szoka et
al., Ann. Rev. Biophys. Bioeng. 9:467 (1980), U.S. Pat. Nos. 4,235,871, 4,501,728 and
4,837,028.



EXAMPLES

The following examples are offered to illustrate, but not to limit the
claimed invention.

Example 1:

This example describes a comparison of the C. pneumoniae genome
disclosed here and the previously sequenced C. trachomatis genome (Stephens, et al.
Science 282:754-759 (1998)).

The apparent low level of DNA homology between C. trachomatis and C.
pneumoniae (Campbell, et al. J. Clin. Microbiol. 25:1911-1916 (1987)), yet analogous
cell structures and developmental cycles, predicts that comparative analysis of the two
genomes will significantly enhance the understanding of both pathogens. Identification
of genes that are present in one species but not the other is of particular importance for
the mutually exclusive biological, virulence and pathogenesis capabilities of each.
Identification of genes shared between the two species strongly supports the requirement
for these capabilities in a biological system that has, over its long-term association with
mammalian host cells, evolved to reduce the metabolic capacities while optimizing
survival, growth and transmission of these unique pathogens.

The previously sequenced C. trachomatis genome contains 1,042,519
nucleotides and 875 likely protein-coding genes. Similarity searching permitted the
inferred functional assignment of 636 (60%) of the genes disclosed here, and 251
(23%) are similar to hypothetical genes of other bacterial organisms, including those of
C. trachomatis. The remaining 186 (17%) genes are not homologous to sequences
deposited in GenBank. Seventy C. trachomatis genes are not represented in the C.
pneumoniae genome. These are contained within blocks consisting of 2-17 genes and 19
single genes. Of the 70 C. trachomatis genes without homologs in C. pneumoniae, 60 are
classified as encoding hypothetical proteins. The remaining genes not represented in C.
pneumoniae consist of the tryptophan operon (trpA, B, R), trpC, two predicted thiol
protease genes, and 4 genes assigned to the phospholipase-D superfamily.
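
The percentages quoted for the three annotation categories can be checked with a
short calculation. This is a sketch only: it assumes the three categories partition
the full set of predicted C. pneumoniae coding sequences, so the total of 1,073 is
derived here rather than quoted from the passage, and the category labels are
paraphrases.

```python
# Counts quoted in the text for the C. pneumoniae coding sequences.
counts = {
    "functional assignment inferred": 636,
    "similar to hypothetical genes": 251,
    "no homolog in GenBank": 186,
}

# Assumption: the three categories are exhaustive and non-overlapping.
total = sum(counts.values())
shares = {label: 100.0 * n / total for label, n in counts.items()}

print("total coding sequences:", total)
for label, pct in shares.items():
    print(f"{label}: {counts[label]} ({pct:.1f}%)")
```

On this assumption the shares come to roughly 59.3%, 23.4% and 17.3%, in line with
the rounded figures of 60%, 23% and 17% given in the text.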

It is evident that there is a high level of functional conservation between C.
pneumoniae and C. trachomatis, as orthologs to C. trachomatis genes were identified for
859 (80%) of the predicted coding sequences for C. pneumoniae. The level of similarity
for individual encoded proteins spans a wide spectrum (22-95% amino acid identity) with
an average of 62% amino acid identity between orthologs from the two species. The
percent amino acid identity between orthologous chlamydial proteins is similar among
functional groups, with the highest for proteins associated with translation and the lowest
for proteins whose function in chlamydiae is uncharacterized and not related to proteins
encoded by other organisms. The gene order of the homologous set of genes in C.
pneumoniae shows reorganization relative to the genome of C. trachomatis; however,
there is a high level of synteny for the gene organization of the two genomes. We
identified thirty-nine blocks of 2 or more genes whose gene organization is colinear with
homologs to C. trachomatis, although some of these are inverted. Genome
reorganization is not evenly distributed on the chromosome, as the region
between C. pneumoniae coding sequences 0130-0300 contains substantially more
reorganization than other areas of the genome. This region coincides with the predicted
chromosome replication terminus.
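
One way to identify colinear blocks of two or more genes, including inverted
blocks, is sketched below. This is an illustrative greedy scan, not the procedure
used to obtain the thirty-nine blocks reported here. The input representation (for
each gene in chromosomal order, the position of its ortholog in the other genome,
or None) and the function name are assumptions.

```python
from typing import List, Optional, Tuple


def colinear_blocks(ortholog_index: List[Optional[int]],
                    min_len: int = 2) -> List[Tuple[int, int, int]]:
    """Return (start, end, direction) for maximal colinear runs.

    ortholog_index[i] is the position, in the other genome, of the ortholog
    of gene i (None if there is no ortholog). direction is +1 for the same
    order and -1 for an inverted block. start and end are inclusive indices.
    """
    blocks = []
    i, n = 0, len(ortholog_index)
    while i < n:
        if ortholog_index[i] is None:
            i += 1
            continue
        j, direction = i, 0
        # Extend the run while neighbours stay adjacent in the other genome
        # and keep the same orientation.
        while j + 1 < n and ortholog_index[j + 1] is not None:
            step = ortholog_index[j + 1] - ortholog_index[j]
            if step not in (1, -1) or (direction and step != direction):
                break
            direction = step
            j += 1
        if j - i + 1 >= min_len:
            blocks.append((i, j, direction))
        i = j + 1
    return blocks


# Hypothetical toy input: one colinear block, a gap, one inverted block.
toy = [10, 11, 12, None, 7, 6, 5, 20]
print(colinear_blocks(toy))
```

The scan is greedy, so a gene at the boundary between a forward and an inverted run
is assigned to whichever block is encountered first.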

We identified orthologs of enzymes characterized in other bacteria that
account for the essential requirements for DNA replication, repair, transcription and
translation, including two predicted DNA helicases of the Swi2/Snf2 family found in C.
trachomatis. Similar to C. trachomatis, alternative sigma subunits for RNA polymerase,
σ28 and σ54, were identified in addition to anti-σ regulatory system factors RsbV, a
RsbW-like single-domain histidine kinase, and a RsbU-like protein phosphatase. These
findings suggest that the fundamental mechanisms of transcriptional regulation are
conserved among Chlamydia. The C. trachomatis proteins containing SET and SWIB
domains, and a SWIB domain fused to the C-terminus of the chlamydial topoisomerase I,
not identified outside eukaryotes, are found in C. pneumoniae, supporting their possible
role in the chromatin condensation-decondensation characteristic of the biologically
unique chlamydial developmental cycle.

The central metabolic pathways inferred from the C. pneumoniae genome
sequence are the same as those identified for C. trachomatis. C. pneumoniae has a
glycolytic pathway and a linked tricarboxylic acid cycle that, although likely functional, is
incomplete, as genes for citrate synthase, aconitase, and isocitrate dehydrogenase were not
identified. C. pneumoniae has a complete glycogen synthesis and degradation system,
supporting a role for glycogen synthesis and utilization of glucose-derivatives in
chlamydial metabolism. Genes encoding essential functions in aerobic respiration are
present and electron flux may be supported by pyruvate, succinate, glycerol-3-phosphate,
and NADH dehydrogenases, NADH-ubiquinone oxidoreductase and cytochrome oxidase.
C. pneumoniae also contains the V (vacuolar)-type ATPase operon and the two ATP
translocases found in C. trachomatis.

The type-III secretion virulence system required for invasion by several
pathogenic bacteria and found in the C. trachomatis genome in three chromosomal
locations is also present in the C. pneumoniae genome. Each of the components is
conserved and their relative genomic contexts are conserved. Genes such as a predicted
serine/threonine protein kinase and other genes physically linked to genes encoding
structural components of the type-III secretion apparatus, but without identified
homologs, are also highly similar between the two species, suggesting that their functional
roles in modifying cellular biology are fundamentally conserved.

Chlamydia-encoded proteins that are not found in chlamydial organisms
but are localized to the intracellular chlamydial inclusion membrane are likely essential for
the unique intracellular biology and perhaps differences in inclusion morphology
observed between species of Chlamydia. Several such proteins, termed IncA, B and C, have
been characterized for a C. psittaci strain (Rockey, et al. Mol. Microbiol. 15:617-626
(1995); Rockey et al. Infect. Immun. 62:106-112 (1994)). C. pneumoniae and C.
trachomatis encode orthologs to C. psittaci IncB and IncC, and C. trachomatis also
contains an ortholog to IncA. C. pneumoniae contains two genes that encode proteins
with similarity to IncA (CPn0186 and CPn0585), although the level of homology is low,
suggesting analogous but possibly altered functions.
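
The level of homology between two proteins is commonly summarized as percent
amino acid identity over an alignment. A minimal sketch of that calculation
follows. It assumes the sequences have already been aligned by some external tool,
counts only columns where neither sequence has a gap (one of several conventions in
use), and uses a hypothetical toy alignment rather than any sequence disclosed here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, not taken from the sequences disclosed here.
print(percent_identity("MKT-LLVA", "MRTALL-A"))
```

Other conventions divide by the full alignment length or by the shorter sequence
length, which yields lower values for gapped alignments.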

The tryptophan biosynthesis operon (trpA, trpB, trpR) and trpC identified
in C. trachomatis are conspicuously missing in the C. pneumoniae genome. These
represent the entire repertoire of genes associated with tryptophan biosynthesis identified
in C. trachomatis. Seventeen genes adjacent to the C. trachomatis tryptophan operon also
were not found in the C. pneumoniae genome. This region is the single largest loss of a
contiguous genomic segment and includes 4 HKD superfamily encoding genes that
encompass a family of proteins related to endonuclease and phospholipase D. These
findings may be important for the ability of Chlamydia to persist in their hosts and cause
disease by eliciting potent, focal and persistent inflammatory responses thought to be
essential for pathogenesis.

The C. pneumoniae genome contains 187,711 additional nucleotides
compared to the C. trachomatis genome, and the 214 coding sequences not found in C.
trachomatis account for most of the increased genome size. Eighty-eight of these genes
are found in blocks of >10 genes (11-30 genes/block), 41 are single genes, and the
remainder are partnered with at least one other gene. Based upon the observation that
~70% of all the C. pneumoniae genes have an identifiable homolog in GenBank,
exclusive of C. trachomatis, it would be expected that over 150 of the 214 genes should
have a homolog in GenBank, many associated with a function. However, only 28 coding
sequences have similarity to genes from other organisms. Thus the majority of the genes
that are mutually exclusive of C. trachomatis (186 of 214), and the 60 of 70 C.
trachomatis genes that lacked an identifiable homolog in C. pneumoniae, do not have
detectable homologs to genes from other organisms. We predict that most of the unique
genes are essential for specific attributes that define the differential biology, tropism and
pathogenesis of C. trachomatis and C. pneumoniae. Moreover, this suggests that C.
pneumoniae has more unique biological (i.e., virulence) capacity than C. trachomatis.
The ability of C. pneumoniae to be more invasive and survive in a broader range of host
cell types than C. trachomatis is consistent with this hypothesis. Not all of the
differences in biological capacity may be associated with mutually exclusive genes. One
explanation for the significantly lower level of homology between protein sequences
assigned as having C. pneumoniae and C. trachomatis orthologs but no identifiable
orthologs in other organisms is that this set of proteins is not only associated with
biological requirements specific for Chlamydia but this polymorphism may account for
differential biology between the two species. The determination of the genome sequence
from a representative of the C. psittaci group will precisely delineate those genes that are
mutually exclusive and specific for each species.
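
The arithmetic behind this comparison can be laid out explicitly. The sketch below
uses only figures quoted in this example. The implied C. pneumoniae genome size is
derived by addition and is not itself stated in the passage, and the variable names
are chosen for illustration.

```python
CT_GENOME_NT = 1_042_519   # C. trachomatis genome size quoted in the text
EXTRA_NT = 187_711         # additional nucleotides in C. pneumoniae
UNIQUE_CDS = 214           # C. pneumoniae coding sequences absent from C. trachomatis
WITH_HOMOLOG = 28          # of those, similar to genes from other organisms
HOMOLOG_RATE = 0.70        # approximate genome-wide fraction with a GenBank homolog

# Derived values (not quoted directly in the passage).
cpn_genome_nt = CT_GENOME_NT + EXTRA_NT
expected_with_homolog = HOMOLOG_RATE * UNIQUE_CDS
without_homolog = UNIQUE_CDS - WITH_HOMOLOG

print("implied C. pneumoniae genome size:", cpn_genome_nt)
print("expected unique genes with a homolog:", round(expected_with_homolog))
print("unique genes without a detectable homolog:", without_homolog)
```

The expected count of about 150 contrasts with the 28 actually observed, which is
the basis for the statement that 186 of the 214 genes lack detectable homologs.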

The major functionally identifiable addition to the C. pneumoniae genome
is a large expansion of genes encoding a new family of chlamydial polymorphic
membrane proteins (Pmp), alone representing 22% of the increased coding capacity.
While the C. trachomatis genome has 9 pmp genes, remarkably the C. pneumoniae
genome contains 21 pmp genes. Most of these genes appear to be amplified in two
regions of the genome with three stand-alone genes. Interestingly, one of the stand-alone
genes is most closely related to the C. trachomatis pmpD, which is the only stand-alone
pmp gene in the C. trachomatis genome, and it is located with the same relative genomic
context, suggesting an essential and conserved function for this paralog. Six Pmp-coding
genes are presumably not functional as five contain predicted coding frame-shifts and one
is truncated. The amplification of this gene family and the confidently predicted
frame-shifts suggest a specific molecular mechanism to promote functional or antigenic
diversity. The biological role of this protein family remains enigmatic, although at least
one of the proteins in C. psittaci related to this family is exposed on the chlamydial
surface.

While a function could not be assigned for most of the unique C.
pneumoniae genes, several have significant similarity to genes from other organisms.
Functional assignments could be made for genes encoding GMP synthetase, IMP
dehydrogenase, UMP synthase, uridine kinase, biotin synthase pathway proteins,
methylthioadenosine nucleosidase, a DNA glycosylase and aromatic amino acid
hydroxylase. Thus a complete pathway was identified for biotin biosynthesis. The
additional purine and pyrimidine salvage pathway genes presumably reflect metabolic
limitations in one of the cell types that C. pneumoniae infects or differences in the ability
of C. pneumoniae to transport precursor nucleosides or nucleotides.

The addition of aromatic amino acid hydroxylase in C. pneumoniae is
intriguing, especially in light of the loss of tryptophan biosynthetic genes and the inability
to synthesize other amino acids including phenylalanine. Aromatic amino acid
hydroxylases include three distinct enzymes that function, respectively, to oxidize
phenylalanine to tyrosine, tyrosine to Dopa, and tryptophan to 5-hydroxytryptophan and
serotonin. Although the chlamydial protein is similar to proteins of this family and
incrementally more closely related to tryptophan hydroxylase, its specific function could
not be confidently predicted. We hypothesize that it may be involved in C. pneumoniae
virulence. Tryptophan hydroxylase has not been previously identified in bacteria and the
origin of the chlamydial gene appears to be from eukaryotes. The functional role of an
aromatic amino acid hydroxylase for C. pneumoniae is linked to the unique intracellular
biology of this organism and may represent a key contribution to C. pneumoniae
persistence and pathogenesis.

It is understood that the examples and embodiments described herein are
for illustrative purposes only and that various modifications or changes in light thereof
will be suggested to persons skilled in the art and are to be included within the spirit and
purview of this application and scope of the appended claims. All publications, patents,
and patent applications cited herein are hereby incorporated by reference in their entirety
for all purposes.

Table 1 provides functional assignments of C. pneumoniae nonprotein-encoding
genomic sequences. Table 2 provides functional assignments of protein coding
sequences. Table 3 provides the amino acid sequences of the proteins corresponding to
the coding sequences.
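
Each Table 2 entry lists a gene identifier, two coordinates, a strand letter and,
where available, a functional assignment. A minimal sketch of how such an entry
might be read into a record follows. The class and function names are hypothetical,
and the length calculation assumes inclusive coordinates, which is not stated in
this specification. The sample row is the first entry of Table 2.

```python
from dataclasses import dataclass


@dataclass
class CodingSequence:
    gene_id: str
    first: int        # first coordinate listed in the row
    second: int       # second coordinate listed in the row
    strand: str       # "F" or "R" as printed in the table
    description: str = ""

    @property
    def length_nt(self) -> int:
        # Assumes inclusive coordinates, so both end positions are counted.
        return abs(self.second - self.first) + 1


def parse_row(row: str) -> CodingSequence:
    """Parse 'ID first second strand [description...]' split on whitespace."""
    gene_id, first, second, strand, *rest = row.split()
    return CodingSequence(gene_id, int(first), int(second), strand, " ".join(rest))


cds = parse_row("CPn0001 282 4 R CT001 hypothetical protein")
print(cds.gene_id, cds.strand, cds.length_nt)
```

Under the inclusive-coordinate assumption the first entry spans 279 nucleotides, a
whole number of codons.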
TABLE 1

Ile tRNA
Leu tRNA_1
Leu tRNA_2
Leu tRNA_3
Leu tRNA_4
Leu tRNA_5
(R) Lys tRNA
Pro tRNA_2
(R) Pro tRNA_1
Phe tRNA
(R) Arg tRNA_2
(R) Arg tRNA_3
(R) Arg tRNA_4
Arg tRNA_1
Gln tRNA
(R) Thr tRNA_3
(R) Thr tRNA_1
(R) Thr tRNA_2
(R) Met tRNA_1
(R) Met tRNA_2
(R) Met tRNA_3
Ser tRNA_1
Ser tRNA_2
Ser tRNA_3
(R) Ser tRNA_4
(R) Trp tRNA
(R) Tyr tRNA
(R) Val tRNA_1
Val tRNA_2
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TABLE 2 









9r m^r' 




CPnOOOl 


282 


4 


R 


CTOOI nypocAaclcAl protein 


C?tl0002 


573 


875 


r 


7acC*Giu-ciUIX Gin Amidocrans£erafl« (C subunic) - (CT002 ) 


CPnOQ03 


895 


2370 


p 


gacA-Glu cfiNA Gin AjnidocransC«rae* (CT003) 




J 170 


: t : *. 


ir 


•/aiH •<?•': 1 1- J ''I't fi'rt,\ dtn Awi-ior rdn:::^rd3** O ;;ut)un Lr: ) - tCTOO-i » 




4127 






ptrtp^i-poi/morpnic Ouctfr Memord.nti Procum G Faotii/ 


CPnOOOfi 


7293 


7141 


ft 




f*0n0flrt7 


' OU 9 


L0496 






urnuuuo 


1 0Q7S 

XU 7 ' 3 


11685 








L XOX9 


13119 


f 






1 Ills 


14325 






V. rllU U X u 


1«J ' 7 


15746 




Crame-shifc wlclt 0010 


\m rnu U X X 


X9a7* 


16614 


p 




^ rnu u X « 


16644 


18212 


p 




v» mw w X J 


18584 


21106 


F 


ptnp_2*Poiyniorphic Oucer Hambrane Procein G Faiaily 




21392 


21922 




psip_3-Polynarphic Oucer Membrane Procein G Family 




21335 


24174 


y 


pirp_3-PMP_3 t frame -shifc wich 0014) 


mw u X o 


24416 


26188 


I' 


pmp^4'-PolyBiorphic Oucer Membrane Procein G Family 


fPnOfll? 


26094 


27170 


}' 


pmp_4-PMP.4 (frame-shifc wich 0016) 


wrnwwxo 


27522 


29003 


p 


pmp_5-Polyoorpbic Oucer Membrane Procein C Family 


rnu u X 7 


29007 


30356 


p 


pmp_5-PMP_5 (fraaa-ahift wich. 0018} 


wrnuu^u 




30603 




PTmS±cc»<i. OHP flAAdar (14) oaocidA* oucer mambrane] - {CT351) 




34410 


32707 


n 


Predlcced OMP rieadar (19) D«DCidel - CCT350) 




J470* 


34395 




ma£- (eT3491 


Crnuu2 3 


Joou J 




B 

r 


yj J 1^/ a^r~Aix> a * fUTpg.fc y^atwjti* Airaw 


CPn0024 


37596 


36661 


F 


xerC'Inteffrase/recombinase- (CT347) 




J no w 


37684 




ol r «Jl*?Ti InhahvdpolAfl A/Glvcoatil facaaa'* I CT346 ) 


V* rZlU U « o 


J 70« 3 


38762 








42234 


39778 


ft 


lon~£«on ATP~dependenc Procease* (CT344) 




43325 


42543 






V* JruWU* 7 


43755 


43390 


ft 






43891 


44529 


P 


yep l^O'Sialoylycoprocein Sndopcpcidaso^l'* {CT343) 




44711 


44884 


p 


rs21-S21 Ribosomal Procein- CCT342) 


CPn003 2 


44923 


46098 


p 


dnaJ»Heac Shoclc Procein J-CCT341) 


CPn0033 


46138 


48171 


p^ 


pdhA&B/cdbA&odbB- (pyruvace) Oxoisovaleraco Dehydrogenase M.pba & 
Fuaioa- (CT3 40 J 


CPn0034 


49457 


48210 


ft 




CPn0035 


51029 


49569 


ft 


CT339 hypochecical procein 


CPa0036 


51002 


51796 


p 


CT338 hypochecical procein 


CPnO037 


51792 


52115 


r 


pcsH-PTS Phosphocarrier Procein Hpr-(CT337) 


CPn0038 


52119 


53831 


p 


pcsI-FT5 PEP Phosphocransf erase** {CT336) 


C?Tl0039 


54250 


53963 


ft 


ybaB- (CT335) 


CPn0040 


55643 


54318 


ft 


dftaX_l-DNA Pol III Gamma and Tau_l-{CT334) 


CPn0041 


55996 


57342 


p 




CPn0042 


57403 


58182 


p 




CPn0043 


58447 


60372 


p 




CPnQ044 


60419 


60778 


p 




CPn004S 


61069 


62790 


P 




CPa0046 


62790 


63263 


p 




CPn0047 


63455 


63652 


P 




CPn0048 


63687 


65801 


p 


*yqfF-Bs conserved hypochecical IM procein 


CPna049 


66296 


65817 


ft 




CPnOOSO 


66613 


66499 


ft 




CPnQQSl 


66833 


67111 


p 




CPn0052 


68005 


67304 


ft 


heme- Porphobilinogen Deaminase- (CT299) 


CPnOOS3 


69344 


67986 


ft 


sms-Sms Proctin* (CT298) 


CFn0054 


70023 


69313 


ft 


mc-ftibonuciease ZZr'(CT297) 


CPnOOSS 


70129 


70590 


F 


CT296 hypochecical procein 


CPn0056 


70953 


72746 


F


mrsA-Phosphomannomutase-(CT295)


CPn0057 


72934 


73554 


F


sodM-Superoxide Dismutase (Mn)-(CT294)


CPn0058


73639 


74562 




accD-AcCoA Carboxylase/Transferase Beta-(CT293)


CPn0059 


:t6l6 


75050 


F 


dut-dUTP Nucleotidohydrolase-(CT292)


CPn0060


75055 


75528 


F


ptsN_1-PTS IIA Protein-(CT291)


CPn0061


75534 


76208 


F


ptsN_2-PTS IIA Protein HTH DNA-Binding Domain-(CT290)


CPn0062


76}0d 


77690 


F 


CT289 hypothetical protein


CPn0063


78Li: 


78267 


F 




CPn0064


7HJ46 


78576 


F 




CPn0065


7H9::4 


406SL 


F 


CT288 hypothetical protein


CPn0066


H09J5 


rt2655 


F 
CPn0067


82953 


84053 


F 


CPn0068


84903 


84331 


R 


CPn0069


85236


87086 


F 


CPn0070


87378 


87208 


R 


CPn0071


88045


87599 


R 


CPn0072 


89061


88057 


R 


CPn0073


89356 


89574 


F 


CPn0074 


89774


90955 


F 


CPn0075 


91102 


91350 


F 


CPn0076 


91358 


91903 




CPn0077 


92013 


92435 


F 


CPn0078 


92465 


93160 


F 


CPn0079 


93179 


93688 


F 


CPn0080


93735 


94121 


F 


CPn0081


94261 


98016 


F 


CPn0082 


98043 


102221 


F 


CPn0083


102332 


103312 


F 


CPn0084


103362 


103751 


F 


CPn0085 


104506


103766 


R 


CPn0086 


104904 


105527 


F 


CPn0087 


105579 


106376 


F 


CPn0088 


106373 


108145 


F 


CPn0089 


108153 


109466 


F 


CPn0090


109454 


110080 


F 


CPn0091 


110074 


112053 


F 


CPn0092


112151 


112573 


F


CPn0093 


112509 


113015 


F 


CPn0094 


113152 


115971 


F 


CPn0095 


116037 


118790 


F 


CPn0096 


124314 


118837 


R 


CPn0097 


124555 


126006 


F 


CPn0098


127491 


126091 


R 


CPn0099 


127593 


127865 


F 


CPn0100


129141 


127882 


R 


CPn0101


129932 


129141 


R 


CPn0102


130123 


131466 


F 


CPn0103


131480 


132511 


F


CPn0104


133875 


132676 


R


CPn0105


134847 


134029 


R 


CPn0106


135091 


136374 


F 


CPn0107


137162 


136392 


R 


CPn0108


137857 


137303 


R 


CPn0109


138655 


141783 


F 


CPn0110


143734 


141827 


R 


CPn0111


144686 


143934 


R 


CPn0112


144767 


145093 


F 


CPn0113


145335 


146405 


F 




146398


147261 


F 




147279 


148622 


F 


CPn0116


148616 


148972 


F 




CPn0117

148969 


150071 


F 


CPn0118


150102 


150464 


F


CPn0119


150523 


151164


F 


CPn0120


151164


151778 


F 




151778 


152068 


F 


CPn0122 


1 mi 

X3^U / 1 


153723 


F 


CPn0123 




153774 


R 


CPn0124


X30 0X4 


158068 


F 


CPn0125


X 30 U 79 


158605 


F 


CPn0126


X 0 U 7 


161085 


F 


CPn0127 


J» 0 « X ^ ^ 


161130 


R 


CPn0128


X ^ f / 


163053 


F 


CPn0129 


X 0 J / X / 


163064 


R 


CPn0130 


1 7^ C 
X D% A 4 J 


163751 


R 


CPn0131


X 0 4 3 4 7 


165560 


F 


CPn0132 


X 03 3 O ' 


166561 


F 


CPn0133


167334 


XOQ 3 o4 


R 


CPn0134 


169098 


167467 


R 


CPn0135


169448 


169143 


R 


CPn0136 


171401 


169569 


R 


CPn0137 


172254 


171502 


R 


CPn0138


174019 


172700 


R 



CT360 hypothetical protein



CT325 hypothetical protein 

CT324 hypothetical protein 

infA-Initiation Factor IF-1-(CT323)

tufA-Elongation Factor Tu-(CT322)

secE-preprotein translocase-(CT321)

nusG-Transcriptional Antitermination-(CT320)

rl11-L11 Ribosomal Protein-(CT319)

rl1-L1 Ribosomal Protein-(CT318)

rl10-L10 Ribosomal Protein-(CT317)

rl7-L7/L12 Ribosomal Protein-(CT316)
rpoB-RNA Polymerase Beta-(CT315)
rpoC-RNA Polymerase Beta'-(CT314)
tal-Transaldolase-(CT313)
predicted ferredoxin-(CT312)
CT311 hypothetical protein
atpE-ATP Synthase Subunit E-(CT310)
CT309 hypothetical protein
atpA-ATP Synthase Subunit A-(CT308)
atpB-ATP Synthase Subunit B-(CT307)
atpD-ATP Synthase Subunit D-(CT306)
atpI-ATP Synthase Subunit I-(CT305)
atpK-ATP Synthase Subunit K-(CT304)
CT303 hypothetical protein
valS-Valyl tRNA Synthetase-(CT302)
pknD-S/T Protein Kinase-(CT301)
uvrA-Excinuclease ABC Subunit A-(CT333)
pyk-Pyruvate Kinase-(CT332)
htrB-Acyltransferase-(CT010)

CT011 hypothetical protein

ybbP family hypothetical protein-(CT012)

cydA-Cytochrome Oxidase Subunit I-(CT013)

cydB-Cytochrome Oxidase Subunit II-(CT014)

CT017 hypothetical protein

CT016 hypothetical protein

phoH-ATPase-(CT015)

CT058 hypothetical protein_1

CT018

ileS-Isoleucyl-tRNA Synthetase-(CT019)

lepB-Signal Peptidase I-(CT020)

CT021 hypothetical protein

rl31-L31 Ribosomal Protein-(CT022)

prfA-Peptide Chain Releasing Factor (RF-1)-(CT023)

hemK-A/G specific methylase-(CT024)

ffh-Signal Recognition Particle GTPase-(CT025)

rs16-S16 Ribosomal Protein-(CT026)

trmD-tRNA (guanine N-1)-Methyltransferase-(CT027)

rl19-L19 Ribosomal Protein-(CT028)

rnhB_1-Ribonuclease HII_1-(CT029)

gmk-GMP Kinase-(CT030)

CT031 hypothetical protein

metG-Methionyl-tRNA Synthetase-(CT032)

recD_1-Exodeoxyribonuclease V (Alpha Subunit)_1-(CT033)



ycfF-Cationic Amino Acid Transporter-(CT034)
bpl1-Biotin Protein Ligase-(CT035)
similarity to CT036



CHLPS hypothetical protein-(CT109)
groEL_1-HSP-60_1-(CT110)
groES-10KDa Chaperonin-(CT111)
pepF-Oligopeptidase-(CT112)
ybgI-ACR family-(CT108)

hemL-Glutamate-1-semialdehyde-2,1-aminomutase-(CT210)
CPn0139


174656


174093 


R 


yqgE-(CT211)


CPn0140


175110


174673 


R 


yqdE-(CT212)


CPn0141


175802 


175110


R 


rpiA-Ribose-5-P Isomerase A-(CT213)


CPn0142 


176091 


175816 


R 




CPn0143


177335 


176214 


R 


*yxjG_Bs_1 Hypothetical Protein


CPn0144


177963 


180560 


F 


clpB-Clp Protease ATPase-(CT113)


CPn0145 


180777 


182369 


F 


CT114 hypothetical protein


CPn0146 


182613


183095 


F 




CPn0147


183225 


183671 


F




CPn0148


183846 


185702 


F 


pkn1-S/T Protein Kinase-(CT145)


CPn0149 


185715 


187700 


F


dnlJ-DNA Ligase-(CT146)


CPn0150


187834 


192444 


F 


CT147 hypothetical protein


CPn0151


194142


192625 


R 


mhpA-Monooxygenase-(CT148)


CPn0152


195265 


194318


R 


CT149 hypothetical protein


CPn0153


195433 


197892 


F 


leuS-Leucyl tRNA Synthetase-(CT209)


CPn0154


197892 


199202 


F 


gseA-KDO Transferase-(CT208)


CPn0155


199691 


199488 


R 




CPn0156


200117


199770 


R 




CPn0157 


200723 


200298 


R 




CPn0158


201430 


200894 


R 




CPn0159


201772 


201467


R 




CPn0160 


203791 


202127


R 


pfkA_1-Fructose-6-P Phosphotransferase_1-(CT207)


CPn0161


204622 


203798 


R 


predicted acyl transferase family- (CT206) 


CPn0162


205828 


204803 


R 




CPn0163 


206026 


206394 


F




CPn0164 


206496 


206998 






CPn0165


206998 


207582 


F




CPn0166 


207630 


207962 


F




CPn0167 


208306 


207977 


R 




CPn0168


208641 


208417


R 




CPn0169 


209501 


208710


R 




CPn0170


211026


210025


R 




CPn0171


212435


211149


R 


*guaA-GMP Synthase 


CPn0172 


213177


212440


R 


*guaB/impD-Inosine 5'-monophosphate dehydrogenase (COOH-only)


CPn0173 


213987


213715


R 




CPn0174 


214257


214724


F 




CPn0175 


214896


215275


F




CPn0176


215286


216518


F 


CT153 hypothetical protein


CPn0177 


217459


216608


R 




CPn0178 


218052


217789


R 




CPn0179 


218403


218056 


R 




CPn0180


218851


218355


R 




CPn0181


219175


218777


R 




CPn0182 


220695 


219334 


R 


accC-Biotin Carboxylase-(CT124)


CPn0183 


221195


220695 


R 


accB-Biotin Carboxyl Carrier Protein-(CT123)


CPn0184


221775 


221221 


R 


efp_1-Elongation Factor P_1-(CT122)


CPn0185


222451 


221765


R 


rpe/araD-Ribulose-P Epimerase-(CT121)


CPn0186 


222899 


224068 


F 


*similarity to Cps IncA_1-(CT119)


CPn0187 


224248


225045


F


predicted methylase-(CT133)


CPn0188 


225111


226400 


F 


CT132 hypothetical protein


CPn0189


226400 


229825 


F


CT131 homolog-(Possible Transmembrane Protein)


CPn0190 


229919


231274


F




CPn0191


231991


231314


R 


glnQ-ABC Amino Acid Transporter ATPase-(CT130)


CPn0192


232634 


231984


R 


glnP-ABC Amino Acid Transporter Permease-(CT129)


CPn0193


233126


232686 


R 


*argR-Arginine Repressor


CPn0194


233210 


234241 


F 


gcp_2-O-Sialoglycoprotein Endopeptidase_2-(CT197)


CPn0195 


234190


235785 


F 


oppA_1-Oligopeptide Binding Protein_1


CPn0196 


235939 


237519 


F 


oppA_2-Oligopeptide Binding Protein_2-(CT198)


CPn0197


237578 


238862 


F 


oppA_3-Oligopeptide Binding Protein_3


CPn0198 


239169


240746 


F


oppA_4-Oligopeptide Binding Protein_4


CPn0199


241042


241983


F 


oppB_1-Oligopeptide Permease_1-(CT199)


CPn0200


242017 


242668 


F 


oppC_1-Oligopeptide Permease_1-(CT200)


CPn0201 


242064 


243715 


F 


oppD-Oligopeptide Transport ATPase-(CT201)


CPn0202 


243715 


244500 


F 


oppF-Oligopeptide Transport ATPase- (CT202) 


CPn0203


245008 


245802 


F




CPn0204


245817 


246002 


F 




CPn0205 


246133


246327 


F 




CPn0206


246409 


247161


F 


CT203 hypothetical protein 


CPn0207


247208 


248617 


F 


ybhI/SodiT1-Oxoglutarate/Malate Translocator-(CT204)


CPn0208 


248953 


250602 


F 


pfkA_2-Fructose-6-P Phosphotransferase_2-(CT205)


CPn0209


251036 


251272 


F 
CPn0210


252384 


251440 


R 


CPn0211


252756 


252463 


R 


CPn0212 


254066 


252888 


R 


CPn0213 


254342 


254190 


R 


CPn0214 


255657 


254446 


R 


CPn0215 


257015 


255759


R 


CPn0216 


257608 


257174 


R 


CPn0217 


257896 


258579 


F 


CPn0218


259058 


258582 


R 


CPn0219 


259357 


260472 


F 


CPn0220


260696 


261238 


F 


CPn0221


261657 


262064 


F 


CPn0222 


262504 


262842 


F 


CPn0223 


262956 


263333 


F 


CPn0224 


263435 


263674 


F 


CPn0225 


263873 


264541 


F 


CPn0226 


264566 


264967 


F 


CPn0227 


265416 


265009 


R 


CPn0228 


266110 


265412 


R 


CPn0229 


266328 


267560 


F 


CPn0230 


268253 


267576 


R 


CPn0231 


268957 


268253 


R 


CPn0232 


270122 


269232 


R 



CPn0233 


270424 


270248 


R 


CPn0234 


271240 


270548 


R 


CPn0235 


271416 


272177 


F 


CPn0236 


272156 


273766 


F 


CPn0237 


273762 


274214 


F 


CPn0238 


274303 


275838 


F 


CPn0239 


275899 


276672 


F 


CPn0240 


277861 


276698 


R 


CPn0241 


279354 


278203 


R 


CPn0242 


279918 


279487 


R 


CPn0243 


280555 


280133 


R 


CPn0244 


280918 


281556 


F 


CPn0245 


281645 


282499 


F 


CPn0246 


282952 


282551 


R


CPn0247 


283415 


282969 


R 


CPn0248


284327 


283650 


R 


CPn0249 


285841


284333 


R 


CPn0250


286057 


285902 


R 






287559 


F 


CPn0252 


288112 


287576 


R 


CPn0253 


288456 


287950 


R 


CPn0254 


289262 


288459 


R 


CPn0255 


290165 


289329 


R 


CPn0256


291264 


290398 


R 


CPn0257 


292127 


291267 


R 


CPn0258


292534 


292133 


R 


CPn0259 


292966 


292441 


R 


CPn0260


294045 


293548 


R


CPn0261 


294302 


295033 


F 


CPn0262 


295091 


295933 


F 


CPn0263 


296249 


297136 


F 


CPn0264 


297730 


297155 


R 


CPn0265 


298620 


297730 


R 


CPn0266 


299184 


299876 


F 


CPn0267 


300122 


300910 


F 


CPn0268


300935 


301318 


F 


CPn0269 


302450 


301476 


R 


CPn0270 


303325 


302468 


R 


CPn0271 


303634 


304362 


F 


CPn0272 


305233 


304340 


R 


CPn0273 


305844 


305227 


R 


CPn0274 


308353 


305852 


R 


CPn0275 


310786 


308372 


R 


CPn0276 


311137 


310793 


R 


CPn0277 


311910 


311404 


R 


CPn0278 


312875 


312060 


R 


CPn0279 


313537 


312875 


R 


CPn0280 


314572 


313550 


R 



ypdP-(CT140)

tgt-Queuine tRNA Ribosyl Transferase-(CT193)



*weak similarity to Bacteriophage CHP1 (Orf4)



dsbB-Disulfide bond Oxidoreductase-(CT176)
dsbG-Disulfide Bond Chaperone-(CT177)
CT178 hypothetical protein
CT179 hypothetical protein

tauB-ABC Transport ATPase (Nitrate/Fe)-(CT180)
*similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine Nucleosidase

CT181 hypothetical protein

kdsB-Deoxyoctulonosic Acid Synthetase-(CT182)
pyrG-CTP Synthetase-(CT183)
yggF Family-(CT184)

zwf-Glucose-6-P Dehydrogenase-(CT185)
devB-Glucose-6-P Dehydrogenase (DevB family)-(CT186)



adk-Adenylate Kinase-(CT128)

ydhO-Polysaccharide Hydrolase-Invasin Repeat Family-(CT127)

rs9-S9 Ribosomal Protein-(CT126)

rl13-L13 Ribosomal Protein-(CT125)

ycfV/ybbA-ABC Transporter ATPase-(CT152)

CT151 hypothetical protein 

rl33-L33 Ribosomal Protein- (CT150) 

*conserved hypothetical protein

CT144 hypothetical protein (frame-shift with 0253?) 

CT144 hypothetical protein_1

CT143 hypothetical protein_1

CT142 hypothetical protein_1

CT144 hypothetical protein_2

CT143 hypothetical protein_2

CT142 hypothetical protein (frame-shift with 0259?)
CT142 hypothetical protein_2
secA_1-Protein Translocase Subunit_1-(CT141)
ydaO-PP-Loop Superfamily ATPase-(CT217)
surE-SurE-like Acid Phosphatase-(CT218)
yqfU hypothetical protein-(CT221)
ubiD-Phenylacrylate Decarboxylase-(CT220)
ubiA-Benzoate Octaprenyltransferase-(CT219)



Dipeptidase-(CT138)

ywlC-Sua5 Superfamily-related Protein-(CT137)
Lysophospholipase esterase-(CT136)
dnaX_2-DNA Pol III Gamma and Tau_2-(CT187)
tdk-Thymidylate Kinase-(CT188)
gyrA_1-DNA Gyrase Subunit A_1-(CT189)
gyrB_1-DNA Gyrase Subunit B_1-(CT190)
CT191 hypothetical protein

*conserved outer membrane lipoprotein protein
*Possible ABC Transporter Permease Protein
dppF-Dipeptide Transporter ATPase-(CT689)
CPn0281

CPn0282

CPn0283

CPn0284 

CPn0285 

CPn0286 

CPn0287 

CPn0288 

CPn0289 

CPn0290 

CPn0291 

CPn0292 

CPn0293 

CPn0294 

CPn0295

CPn0296 

CPn0297 

CPn0298

CPn0299 

CPn0300 

CPn0301 

CPn0302 

CPn0303 

CPn0304 

CPn0305 

CPn0306 

CPn0307 

CPn0308 

CPn0309 

CPn0310 

CPn0311 

CPn0312 

CPn0313

CPn0314 

CPn0315

CPn0316

CPn0317 

CPn0318 

CPn0319 

CPn0320 

CPn0321

CPn0322 

CPn0323

CPn0324 

CPn0325 

CPn0326 

CPn0327 

CPn0328 

CPn0329 

CPn0330 

CPn0331 

CPn0332 

CPn0333 

CPn0334 

CPn0335

CPn0336 

CPn0337 

CPn0338

CPn0339 

CPn0340 

CPn0341 

CPn0342 

CPn0343 

CPn0344 

CPn0345 

CPn0346 

CPn0347 

CPn0348 

CPn0349 

CPn0350 

CPn0351 



315057 

316125

318497 

319045

320595 

322059 

324221 

325716 

325812 

327042 

328667 

329228 

329949 

333092 

333863 

334765 

335697 

336721 

336616 

337783 

340250 

340787 

342958 

343133 

344154 

345145 

348986 

349234 

350974 

353433 

354438 

354524 

354990 

356285 

356977 

358820 

360081 

362767 

363175 

363860 

365858 

366249 

367331 

369492 

370708 

371148 

372945 

373241 

375088 

376675 

378437 

378655 

379090 

379311 

379817 

380650 

382027 

382278

383420 

383842 

384160 

384622 

384999

387420 

388572 

389675 

391021 

391803 

392770 

393181 

393888



316103 

317529 
317532 
318551 
319051 
320650 
322089 
324571 
326996 
328523 
329194 
329836 
332723 
333502 
333627 
334022 
334774 
335717 
337415 
340152 
340762 
341866 
341921 
344158 
345137 
346431 
346515 
349596 
349595 
351049 
353575 
354976 
355355 
355353 
358716 
360121 
362750 
363126 
363679 
364783 
364767 
367328 
369460 
370688 
371148 
372725 
373211 
374992 
376146 
376202 
376701 
378536 
378800 
379823 
380674 
381591 
381575 
383375 
384034 
384156
384495 
385062 
385595 
385558 
387436 
388704 
389678 
391027 
391790 
393684 
395432 



F 
R 
R 
R 
R 
R 

R 
r 

F 
F 
F 

F 

R 

:\ 
n 
r 
I 

V 
V 

1! 
I 
I 
I 

I. 

r 

R 
R 

R 

F 

F 

R 

F 

F 

F* 

F 

F 

F 

R 

F 

F 

F 

F 

F 

F 

F 

F 

R 

R 

R 

R 
F 
F 
F 
R 
F 
F 
F 
F 
F 
F 
R 
R 
R 
R 
R 
R 
F 



dhnA-Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin family)-(CT215)

xasA/gadC-Amino Acid Transporter-(CT216)



mgtE-Mg++ Transporter (CBS Domain)-(CT194)
CT195 hypothetical protein

aaaT-Neutral Amino Acid (Glutamate) Transporter-(CT230)

Na-dependent Transporter-(CT231)

incB-Inclusion Membrane Protein B-(CT232)

incC-Inclusion Membrane Protein C-(CT233)

CT234 hypothetical protein

cAMP-Dependent Protein Kinase Regulatory Subunit-(CT235)

acpP-Acyl Carrier Protein-(CT236)

fabG-Oxoacyl (Carrier Protein) Reductase-(CT237)

fabD-Malonyl Acyl Carrier Transacylase-(CT238)

fabH-Oxoacyl Carrier Protein Synthase III-(CT239)

recR-Recombination Protein-(CT240)

yaeT-Omp85 Analog-(CT241)

(OmpH-Like Outer Membrane Protein)-(CT242)

lpxD-UDP Glucosamine N-Acyltransferase-(CT243)

CT244 hypothetical protein

pdhA/odpA-Pyruvate Dehydrogenase Alpha-(CT245)
pdhB/odpB-Pyruvate Dehydrogenase Beta-(CT246)
pdhC-Dihydrolipoamide Acetyltransferase-(CT247)
glgP-Glycogen Phosphorylase-(CT248)
similarity to CT249

dnaA_1-Replication Initiation Protein_1-(CT250)

60IM-60kDa Inner Membrane Protein-(CT251)

lgt-Prolipoprotein Diacylglycerol Transferase-(CT252)

CT101 hypothetical protein

acpS-Acyl-carrier Protein Synthase-(CT100)

trxB-Thioredoxin Reductase-(CT099)

rs1-S1 Ribosomal Protein-(CT098)

nusA-N Utilization Protein A-(CT097)

infB-Initiation Factor-2-(CT096)

rbfA-Ribosome Binding Factor A-(CT095)

truB-tRNA Pseudouridine Synthase-(CT094)

ribF-FAD Synthase-(CT093)

ychF-GTP Binding Protein-(CT092)

yscU-Yops Translocation Protein U-(CT091)

lcrD-Low Calcium Response D-(CT090)

lcrE-Low Calcium Response E-(CT089)

sycE-Secretion Chaperone-(CT088)

malQ-Glucanotransferase-(CT087)

rl28-L28 Ribosomal Protein-(CT086)

CT085 hypothetical protein 

Phospholipase D Superfamily [leader (33) peptide]-(CT084)

CT083 hypothetical protein

CT082 hypothetical protein

CHLTR T2 Protein-(CT081)

ltuB-(CT080)

CT079 similarity

folD-Methylene Tetrahydrofolate Dehydrogenase-(CT078)
yojL-(CT077)

smpB-Small Protein B-(CT076)
dnaN-DNA Pol III (beta chain)-(CT075)
recF-ABC superfamily ATPase-(CT074)
(frame-shift with 0339)
(frame-shift with 0340)

predicted OMP (leader (19) peptide)-(CT073)
(frame-shift with 0342?)
yaeL-Metalloprotease-(CT072)
yaeM-(CT071)

troD/ytgD-Integral Membrane Protein-(CT070)
troC/ytgC-Integral Membrane Protein-(CT069)
troB/ytgB-ABC transporter ATPase-(CT068)
troA/ytgA-Solute Protein Binding Family-(CT067)
CT066 hypothetical protein
adt_1-ADP/ATP Translocase_1-(CT065)
CPn0352

CPn0353 

CPn0354 

CPn0355 

CPn0356 

CPn0357 

CPn0358

CPn0359 

CPn0360 

CPn0361 

CPn0362 

CPn0363 

CPn0364 

CPn0365 

CPn0366 

CPn0367 

CPn0368 

CPn0369 

CPn0370 

CPn0371 

CPn0372

CPn0373 

CPn0374 

CPn0375 

CPn0376

CPn0377 

CPn0378 

CPn0379 

CPn0380 

CPn0381 

CPn0382 

CPn0383 

CPn0384 

CPn0385

CPn0386 

CPn0387 

CPn0388 

CPn0389

CPn0390 

CPn0391

CPn0392 

CPn0393

CPn0394 

CPn0395 

CPn0396 

CPn0397 

CPn0398 

CPn0399 

CPn0400 

CPn0401 

CPn0402 

CPn0403 

CPn0404 

CPn0405 

CPn0406 

CPn0407 

CPn0408 

CPn0409 

CPn0410 

CPn0411

CPn0412 

CPn0413 

CPn0414 

CPn0415 

CPn0416 

CPn0417 

CPn0418 

CPn0419 

CPn0420

CPn0421

CPn0422

CPn0423



395574 

396893 

397167 

399889 

400459 

401317 

401751 

402012 

405358 

406647 

407825 

409688 

409966 

410528 

411976 

413102 

413790 

414351 

415800 

417147 

417687 

416380 

420218 

421121 

421854 

423438 

426168 

426322 

426758 

429809 

430749 

431693 

432377 

434018 

434525 

435196 

435329 

438134 

439144 

439692 

439814 

440379 

440736 

441964 

444353 

445115 

445533 

445879 

446536 

447884 

448994 

449015 

450887 

451739 

451969 

453742 

454105 

454645 

455123 

455833 

456590 

459203 

460143 

461498 

461856 

463035 

464401 

466834 

467108 

467998 

458242 

468791 



396830 

397135 

398507 

398591 

400109 

400469 

401578 

403817 

403922 

405382 

407055 

407943 

410238 

411544 

412440 

413836 

414107 

415562

416912 

417503 

418001 

420218 

420961 

421615 

422294 

422347 

423445 

426765 

427876 

428037 

430036 

430749 

431662 

432522

434046 

434699 

437320 

437319 

438134 

439510 

440383 

440723 

441968 

443175 

443241 

444381 

445700 

446523 

447306 

447495 

447888 

449710

449871 

450966 

452865 

452858 

454581 

455127 

455833 

456609 

457246 

457227 

459172 

460221 

461557 

462244 

462953

464876 

466624 

467108 

468784

469216 



F 

F 

F 

R 

R 

R 

R 

F 

R 

R 

R 

R 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

R 

R 

F 

F 

R 

R 

R 

R 

R 

R 

R 

F 

R

R 

R 

F 

F 

F 

F 

R 

R 

F 

F 

F 

R 

R 

F 

R 

R 

F 

R 

F 

F 

F 

F 

F 

R 

R 

R 

R 

R 

R 

R 

R 

R 

F 

F 



lepA-GTPase-(CT064)

gnd-6-Phosphogluconate Dehydrogenase-(CT063)
tyrS-Tyrosyl tRNA Synthetase-(CT062)
fliA/rpsD-Sigma-28/WhiG Family-(CT061)
flhA-Flagellar Secretion Protein-(CT060)
fer4-Ferredoxin IV-(CT059) 



CT058 hypothetical protein_2 
CT058 hypothetical protein_3



gcpE-(CT057) 

CT056 hypothetical protein



sucB_1-Dihydrolipoamide Succinyltransferase_1-(CT055)
sucA-Oxoglutarate Dehydrogenase-(CT054)
CT053 hypothetical protein

hemN_1-Coproporphyrinogen III Oxidase_1-(CT052)
CT326 similarity

yabC/yraL-SAM-Dependent Methyltransferase-(CT048)
CT047 hypothetical protein
hctB-Histone-like Protein 2-(CT046)
pepA-Leucyl Aminopeptidase A-(CT045)
ssb-SS DNA Binding Protein-(CT044)
CT043 hypothetical protein

glgX-Glycogen Hydrolase (debranching)-(CT042)
CT041 hypothetical protein
ruvB-Holliday Junction Helicase-(CT040)

dcd-dCTP Deaminase-(CT039)
CT038 hypothetical protein

tlyC_1-CBS Domain protein (Hemolysin Homolog)_1-(CT256)
CT257 hypothetical protein
yhfO-NifS-related protein-(CT258)
PP2C phosphatase family-(CT259)

CT253 hypothetical protein 
CT254 hypothetical protein 
CT255 hypothetical protein 
mutY-Adenine Glycosylase-(CT107)

yceC-predicted pseudouridine synthetase family-(CT106)
CT105 hypothetical protein

fabI-Enoyl-Acyl-Carrier Protein Reductase-(CT104)

HAD superfamily hydrolase/phosphatase-(CT103)

CT102 hypothetical protein 

CT260 hypothetical protein 

dnaQ_1-DNA Pol III Epsilon Chain_1-(CT261)

CT262 hypothetical protein 

CT263 hypothetical protein 

msbA-Transport ATP Binding Protein- (CT264) 

accA-AcCoA Carboxylase/Transferase Alpha-(CT265)

CT266 hypothetical protein

himD/ihfA-Integration Host Factor Alpha-(CT267)
amiA-N-Acetylmuramoyl Alanine Amidase-(CT268)
murE-N-Acetylmuramoylalanylglutamyl DAP Ligase-(CT269)
pbp3-transglycolase/transpeptidase-(CT270)
CT271 hypothetical protein

yabC-PBP2B Family methyltransferase-(CT272)
CT273 hypothetical protein 
CT274 hypothetical protein 
CPn0424


469612 


470961 


F


CPn0425 


470980 


471564 


F


CPn0426 


472111 


471536 




CPn0427 


472207 


473715 






473722 


474681 


r 




474681 


475319 


r 




475326 


476093 


F 




476483 


471! 1^1 
4 ' OXSi 


R 




476816 


4 /0314 


R 






A 7<I Q 


R 


CPn0434 


4/94 0^ 


^7'T^^C 
4 / / 270 


R 




A flOOn'7 
40U9U^ 


4/94/3 


R 


CPn0436


AQ^ CI a 

4tiXoXo 


^ Q AO A^ 


R 


CPn0437


4 O XO 


AO A ^ C A 
40« J9U 


F 


CPn0438


4 03 4 JL O 


404 J J4 


R 




^ (1 C C G 1 


486077 


F 




40DXU3 


A fxe^ A A 


F 




A A A R Q 1 
40Q0 7 X 


A D*TO ^ O 
49 fOiO 


F 




4aD UXJ 


488528 


F 




400 / 


4o9979 


F 




4 O / 


494507 


F 




4 74 / 7i 


497579 


F 


CPn0446 


4 97020 


500415 


F 




C AA C£ D 


503351 


F 


CPn0448


C A^ a 1 A 
3U4olU 


503698 


R 




C; ATT 7 1 


505330 


R 


CPn0450


C AQ Y 1 ^ 


507X80 


R 




C AO^TC 

30O273 


51X058 


F 


CPn0452


51X319 


5X2660 


F 




3X J2 J4 


316X32 


F 






5X9115 


F 




o A1 ^ a 
3^U j4o 


519458 


R 




32X3 


C A^ 1*7 

320327 


R 


CPn0457


34C Joo3 


522X20 


R 




C7in7n 

3^0 J 


32423o 


R 




3X / UU3 


3^0Dl9 


R 


CPn0460


^ 77fl A n 
3^ /a4u 


320772 


R 


CPn0461


3aOOJO 


C77n4 A 
32 /044 


R 




33XU3^ 


327037 


R 


CPn0463


^15 T «;7 

3 J* J 3 ' 


CTl 1 O ■» 
33XX71 


R 




3 JA04^ 


332 Job 


R 


CPn0465


3 J J A XX 


33«a / 1 


R 




33J 


3303 J7 


F 




33 D03 J 


3J74 J4 


F 




<ifi17 
3 J 7 D J X 


C^ A^ T T 
34U4 J2 


F 






C^ 1 A £ A 
34X400 


F 


CPn0470 


3 4X J 3 / 


«i J7<!1^ 
3423 J2 


F 


CPn0471 


542564 


3434UX 


F 


CPn0472 


547905 


w«330X 


R 


CPn0473 


549593 


548070 


R 


CPn0474


551573


549807 


R 


CPn0475 


553844 


551685


R


CPn0476 


554844 


553858 


R


CPn0477 


556106 


554844 


R


CPn0478 


557625 


556210 


R


CPn0479 


558425 


557616 




CPn0480 


559303 


558650 




CPn0481 


560946


559339 




CPn0482 


561737 


560961 




CPn0483 


561836 


564964 


F 


CPn0484 


564970 


565824 




CPn0485 


566038 


566229 


F


CPn0486 


567784 


566405 




CPn0487 


569740 


568112 


R 


CPn0488


570096 


569767 


R 


CPn0489 


570965 


570096 


R 


CPn0490 


571279 


573333 


F 


CPn0491 


574352 


573336 


R 


CPn0492


574652 


574804 


F 


CPn0493


575004 


574855 


R 


CPn0494


575364 


575146 


R 


CPn0495 


575603 


576793 


F



dnaA_2-Replication Initiation Factor_2-(CT275)
CT276 hypothetical protein
CT277 similarity

nqr2-NADH (Ubiquinone) Dehydrogenase-(CT278)
nqr3-NADH (Ubiquinone) Oxidoreductase, Gamma-(CT279)
nqr4-NADH (Ubiquinone) Reductase 4-(CT280)
nqr5-NADH (Ubiquinone) Reductase 5-(CT281)



gcsH-Glycine Cleavage System H Protein-(CT282)
CT283 hypothetical protein

Phospholipase D superfamily (uncleavable leader peptide)-(CT284)
lplA-Lipoate Protein Ligase-Like Protein-(CT285)
clpC-ClpC Protease-(CT286)
ycbF-PP-loop superfamily ATPase-(CT287)



CT007 hypothetical protein 
CT006 hypothetical protein 
CT005 hypothetical protein 

pmp_6-Polymorphic Outer Membrane Protein G/I Family

pmp_7-Polymorphic Outer Membrane Protein G Family

pmp_8-Polymorphic Outer Membrane Protein G Family

pmp_9-Polymorphic Outer Membrane Protein G/I Family

*yxjG_Bs_2 Hypothetical Protein

pmp_10-PMP_10 (Frame-shift with 0451)

pmp_10-Polymorphic Outer Membrane Protein G Family

pmp_11-Polymorphic Outer Membrane Protein G Family

pmp_12-Polymorphic Outer Membrane Protein A/I Family (truncated)

pmp_13-Polymorphic Outer Membrane Protein G Family

pmp_14-Polymorphic Outer Membrane Protein H Family



pmp_15-Polymorphic Outer Membrane Protein E Family

pmp_16-Polymorphic Outer Membrane Protein E Family

pmp_17-Polymorphic Outer Membrane Protein E Family

pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0469)

pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0470)

pmp_18-Polymorphic Outer Membrane Protein E/F Family



CT365 hypothetical protein 
glgB-Glucan Branching Enzyme-(CT866)
CT865 hypothetical protein 
*yqeV_Bs Hypothetical Protein
hflX-GTP Binding Protein-(CT379)
phnP-Metal Dependent Hydrolase-(CT380)
CT383 hypothetical protein 

artJ-Arginine Periplasmic Binding Protein-(CT381)

aroG-Deoxyheptonate Aldolase-(CT382)
CT382.1 hypothetical protein
* hypothetical proline permease 
CT384 hypothetical protein 
hitA-HIT Family Hydrolase-(CT385)
CT386 hypothetical protein 
CT387 hypothetical protein 
CT399 hypothetical protein 



aspC-Aspartate Aminotransferase-(CT390)
CPn0496


576793 


577812 


F 


CT391 hypothetical protein


CPn0497 


578089 


577820 


R 


CT388 hypothetical protein


CPn0498 


579035 


578085 


R 




CPn0499 


580359 


579205 


R 




CPn0500


580659 


582362 


F


proS-Prolyl tRNA Synthetase-(CT393)


CPn0501


582457 


583650 


F


hrcA-HTH Transcriptional Repressor-(CT394)


CPn0502 


583650 


584201 


F 


grpE-HSP-70 Cofactor-(CT395)


CPn0503 


584234 


586213 


F 


dnaK-HSP-70-(CT396)


CPn0504 


586487 


588514


F 


vacB-ribonuclease family-(CT397)


CPn0505 


588519 


589106 


F 


*3-methyladenine DNA glycosylase


CPn0506


589172 


589840 


F 


CT421 hypothetical protein


CPn0507 


589961 


590122 


F 


CT421.1 hypothetical protein


CPn0508


590142


590300 


F 


CT421.2 hypothetical protein


CPn0509


590335 


590808 


F 


(predicted Metalloenzyme)-(CT422)


CPn0510


590813 


591973 


F 


tlyC_2-CBS Domains (Hemolysin homolog)_2-(CT423)


CPn0511


592141


592488 


F 


rsbV_1-Sigma Regulatory Factor_1-(CT424)


CPn0512


592553 


594412 


F 


CT425 hypothetical protein


CPn0513


594647 


595753 


F 


Fe-S oxidoreductase_1-(CT426)


CPn0514 


595729 


596520 


F 


CT427 hypothetical protein


CPn0515


596492 


597181 


F 


ubiE-Ubiquinone Methyltransferase-(CT428)


CPn0516 


598814


597255 


R 




CPn0517 


599631 


598795 


R 




CPn0518


600803 


599832 


R 


CT429 hypothetical protein


CPn0519 


601674 


600904 


R 


dapF-Diaminopimelate Epimerase-(CT430)


CPn0520


602218 


601646 


R 


clpP-CLP Protease-(CT431)


CPn0521


603797 


602241 


R 


glyA-Serine Hydroxymethyltransferase-(CT432)


CPn0522 


603987 


604655 


F 


CT433 hypothetical protein


CPn0523 


604723 


605052 


F 




CPn0524 


605103 


606179 


F 




CPn0525


606522 


607283 


F 


CT398 hypothetical protein


CPn0526


608696 


607710 


R 


yrbH-GutQ/KpsF Family Sugar-P Isomerase-(CT399)


CPn0527


609904 


608726 


R 


sucB_2-Dihydrolipoamide Succinyltransferase_2-(CT400)


CPn0528 


611162 


609921 


R 


gltT-Glutamate Symport-(CT401)


CPn0529


612259 


611165 


R 


ycaK-ATPase- (CT402) 


CPn0530


613254 


612460 


R 


spoU_1-rRNA Methylase_1-(CT403)


CPn0531 


614069 


613245 


R 


SAM dependent methyltransferase-(CT404)


CPn0532


614674 


614075 


R 


ribC/risA-Riboflavin Synthase- (CT405) 


CPn0533


614930 


615385 


F


CT406 hypothetical protein 


CPn0534 


615413 


615764


F 


dksA-DnaK Suppressor-(CT407)


CPn0535


615793 


616296 


F 


lspA-Lipoprotein Signal Peptidase-(CT408)


CPn0536


616345 


617691 


F 


dagA_1-D-Ala/Gly Permease_1-(CT409)


CPn0537 


617833 


618189 


F 


CT814.1 hypothetical protein 


CPn0538


618212


618511


F 


CT814 hypothetical protein


CPn0539 


618705 


621545 


F 


pmp_19-Polymorphic Outer Membrane Protein A Family-(CT412)


CPn0540 


621694 


626862 


F 


pmp_20-Polymorphic Outer Membrane Protein B Family-(CT413)


CPn0541 


627170 


628003 


F 


Solute binding protein (*yebL-Synechocystis Adhesin Homolog)-(CT415)
628003


628737 


F 


ABC Transporter ATPase- (CT416)
F


(Metal Transport Protein)- (CT417)
R


yhbZ-GTP binding protein- (CT418)
R


rl27-L27 ribosomal protein- (CT419)


CPn0546
R


rl21-L21 Ribosomal Protein- (CT420)
F


ygbB family- (CT434)
R


cysJ-Sulfite Reductase- (CT435)


CPn0549 


633569
R


rs10-S10 Ribosomal Protein- (CT436)
635661
R


fusA-Elongation Factor G- (CT437)
636166


635698 


R 


rs7-S7 Ribosomal Protein- (CT438)
636567


636219 


R 


rs12-S12 Ribosomal Protein- (CT439)
637747
R
F


CT440 hypothetical protein
R


crpA-15kDa Cysteine-Rich Protein- (CT442)
omcB-60kDa Cysteine-Rich Outer Membrane Complex Protein- (CT443)
643031


R 


omcA-9kDa-Cysteine-Rich Outer Membrane Complex Lipoprotein- (CT444)


CPn0559 


643742
CT441.1 hypothetical protein
645612


644096 


R 


gltX-Glutamyl-tRNA Synthetase- (CT445)


CPn0561


646404 


645871 


R 


euo-CHLPS Euo Protein- (CT446) 


CPn0562 


648036 


646918 


R 


*CHLPS 43 kDa protein homolog_1


CPn0563 


650056 


648293 


R 


recJ-ssDNA Exonuclease- (CT447)


CPn0564


654350 


650145 


R 


secD&secF-Protein Export Proteins SecD/SecF (fusion)- (CT448)


CPn0565


655630 


654533 


R 


CT449 hypothetical protein


CPn0566


656141 


656890 


F 


yaeS family- (CT450)


CPn0567 


656894 


657817 


F 


cdsA-Phosphatidate Cytidylyltransferase- (CT451)
725082


724750 


R 


CPn0637 


725464 


725099 


R 


CPn0638


725747 


725490
cdsA-Phosphatidate Cytidylyltransferase- (CT452)
plsC-Glycerol-3-P Acyltransferase- (CT453)
argS-Arginyl tRNA Transferase- (CT454)
murA-UDP-N-Acetylglucosamine Transferase- (CT455)
CT456 hypothetical protein
yebC family- (CT457)

YhhY-Amino Group Acetyl Transferase- (CT458)

prfB-Peptide Chain Release Factor 2 (natural UGA frame-shift)- (CT459)
prfB- (natural UGA frame-shift)
SWIB (YM74) complex protein- (CT460)
yaeI-phosphohydrolase- (CT461)

ygbP/yacM-Sugar Nucleotide Phosphorylase- (CT462)

truA-Pseudouridylate Synthase I- (CT463)

Phosphoglycolate Phosphatase- (CT464)

CT465 hypothetical protein 

CT466 hypothetical protein

atoS/ntrB-2-Component Sensor- (CT467)

*similarity to Cps IncA_2

atoC/ntrC-2-Component Regulator- (CT468)

*yvyD_Bs conserved hypothetical protein

CT469 hypothetical protein

CT470 hypothetical protein 

CT471 hypothetical protein

yagE family- (CT472)

yidD family- (CT473)

CT474 hypothetical protein

pheT-phenylalanyl tRNA Synthetase Beta- (CT475)

CT476 hypothetical protein

ada-methyltransferase- (CT477)

oppC_2-Oligopeptide Permease_2- (CT478)

oppB_2-Oligopeptide Permease_2- (CT479)

oppA_5-Oligopeptide Binding Lipoprotein_5- (CT480)

CT483 hypothetical protein
CT484 hypothetical protein
hemZ-Ferrochelatase- (CT485)
fliY-Glutamine Binding Protein- (CT486)
yhhF-Methylase- (CT487)
CT488 hypothetical protein
glgC-Glucose-1-P Adenyltransferase- (CT489)

*pyrF-Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated?

CT490 hypothetical protein

rho-Transcription Termination Factor- (CT491)

yacE-predicted phosphatase/kinase- (CT492)

polA-DNA Polymerase I- (CT493)

sohB-Protease- (CT494)

adt_2-ADP/ATP Translocase_2- (CT495)

pgsA_1-Glycerol-3-P Phosphatidyltransferase_1- (CT496)

dnaB-Replicative DNA Helicase- (CT497)

gidA-FAD-dependent oxidoreductase- (CT498)

lplA-Lipoate-Protein Ligase A- (CT499)

ndk-Nucleoside-2-P Kinase- (CT500)

ruvA-Holliday Junction Helicase- (CT501)

ruvC-Crossover Junction Endonuclease- (CT502)

CT503 hypothetical protein

CT504 hypothetical protein

gapA-Glyceraldehyde-3-P Dehydrogenase- (CT505)

rl17-L17 Ribosomal Protein- (CT506)

rpoA-RNA Polymerase Alpha- (CT507)

rs11-S11 Ribosomal Protein- (CT508)

rs13-S13 Ribosomal Protein- (CT509)

secY-Translocase- (CT510)

rl15-L15 Ribosomal Protein- (CT511)

rs5-S5 Ribosomal Protein- (CT512)

rl18-L18 Ribosomal Protein- (CT513)

rl6-L6 Ribosomal Protein- (CT514) 

rs8-S8 Ribosomal Protein- (CT515) 

rl5-L5 Ribosomal Protein- (CT516) 

rl24-L24 Ribosomal Protein- (CT517)

rl14-L14 Ribosomal Protein- (CT518)

rs17-S17 Ribosomal Protein- (CT519)
727713
729621

730331 

731603
733501
742923

744190
752765

753630
761320
771404
773452
776256
780216

781769
793683
783447

784201 

784721
795742

796210
rl29-L29 Ribosomal Protein- (CT520)
rl16-L16 Ribosomal Protein- (CT521)
rs3-S3 Ribosomal Protein- (CT522)
rl22-L22 Ribosomal Protein- (CT523)
rs19-S19 Ribosomal Protein- (CT524)
rl2-L2 Ribosomal Protein- (CT525)
rl23-L23 Ribosomal Protein- (CT526)
rl4-L4 Ribosomal Protein- (CT527)
rl3-L3 Ribosomal Protein- (CT528)
CT529 hypothetical protein

fmt-Methionyl tRNA Formyltransferase- (CT530)
lpxA-Acyl-Carrier UDP-GlcNAc- (CT531)
fabZ-Myristoyl-Acyl Carrier Dehydratase- (CT532)
lpxC-Myristoyl GlcNac Deacetylase- (CT533)
cutE-Apolipoprotein N-Acetyltransferase- (CT534)
vdlD/yciA-acyl-CoA Thioesterase- (CT535)
dnaQ_2-DNA Pol III Epsilon Chain_2- (CT536)

yjeE (ATPase or Kinase)- (CT537)
CT538 hypothetical protein
trxA-Thioredoxin- (CT539)
spoU_2-rRNA Methylase_2- (CT540)

mip-FKBP-type peptidyl-prolyl cis-trans isomerase- (CT541)
aspS-Aspartyl tRNA Synthetase- (CT542)
hisS-Histidyl tRNA Synthetase- (CT543)

uhpC-Hexosphosphate Transport- (CT544)
dnaE-DNA Pol III Alpha- (CT545)
predicted OMP (leader (17))- (CT546)
CT547 hypothetical protein
CT548 hypothetical protein

rsbW-sigma regulatory factor-histidine kinase- (CT549)
CT550 hypothetical protein

dacF (pbp5)-D-Ala-D-Ala Carboxypeptidase- (CT551)
CT552 hypothetical protein
fmu-RNA Methyltransferase- (CT553)
CT696 hypothetical protein
homologous to CT695 



pgk-Phosphoglycerate Kinase- (CT693)
ygo4-Phosphate Permease- (CT692)
CT691 hypothetical protein

dppD-ABC ATPase Dipeptide Transport- (CT690)
dppF-ABC ATPase Dipeptide Transport- (CT689)
spoJ/parB-Chromosome Partitioning Protein- (CT688)



CT482 hypothetical protein
CT481 hypothetical protein

yfhO_1-NifS-related Aminotransferase_1- (CT687)
ABC Transporter Membrane Protein- (CT686)
abcX-ABC Transporter ATPase- (CT685)
ABC Transporter- (CT684)

TPR Repeats (O-Linked GlcNAc Transferase similarity)- (CT683)

pbp2-PBP2-transglycolase/transpeptidase- (CT682)

ompA-Major Outer Membrane Protein- (CT681)

rs2-S2 Ribosomal Protein- (CT680) 

tsf-Elongation Factor TS- (CT679)

pyrH-UMP Kinase- (CT678)

rrf-Ribosome Releasing Factor- (CT677) 

CT676 hypothetical protein

karG-Arginine Kinase- (CT675)

yscC/gspD-Yop C/Gen Secretion Protein D- (CT674)
pkn5-S/T Protein Kinase- (CT673)

fliN-Flagellar Motor Switch Domain/YscQ family- (CT672)

CT671 hypothetical protein

CT670 hypothetical protein

yscN-Yop N (Flagellar-Type ATPase)- (CT669)

CT668 hypothetical protein

CT667 hypothetical protein

CT666 hypothetical protein
845629


845006 


R 


CPn0750 


846411 


845707 


R 



CPn0751 


846608 


848434 


F 


CPn0752 


848604 


850082 


F 


CPn0753 


851006 


850161 


R 


CPn0754 


851336 


851040 


R 


CPn0755 


851597 


852799 


F 


CPn0756 


852961 


854676 


F 


CPn0757 


854733 


855134 


F 


CPn0758 


855110 


856459 


F 


CPn0759


856488


856997


F


CPn0760 


856957 


857694 


F 


CPn0761 


857704 


858375 


F 


CPn0762 


859597 


858539 


R 


CPn0763 


860511 


859972 


R 


CPn0764 


861807 


860524 


R 


CPn0765 


862382 


861801


R 


CPn0766 


863782 


862394 


R 


CPn0767 


863884 


864177 


F 


CPn0768 


864159 


865163


F 


CPn0769 


867733 


865121 


R 


CPn0770 


868340 


869131 


F 


CPn0771 


870463 


869144 


R 


CPn0772 


872385 


870469


R 


CPn0773 


872488 


873195 


F 


CPn0774 


873195 


873425 


F 


CPn0775 


874031 


873414


R 


CPn0776 


874246


875487 


F 


CPn0777 


875601 


877178 


F 


CPn0778 


877505 


878092 




CPn0779 


878431 


878095 


R 



CT665 hypothetical protein

(FHA domain; homology to adenylate cyclase)- (CT664)

CT663 hypothetical protein

hemA-Glutamyl tRNA Reductase- (CT662)

gyrB_2-DNA Gyrase Subunit B_2- (CT661)

gyrA_2-DNA Gyrase Subunit A_2- (CT660)

CT656 hypothetical protein 

CT657 hypothetical protein 

sfhB-(Pseudouridine Synthase)- (CT658)

CT659 hypothetical protein

kdsA-KDO Synthetase- (CT655)

CT654 hypothetical protein

yhbG-ABC Transporter ATPase- (CT653)

CT652.1 hypothetical protein

CT620 hypothetical protein 

CT619 hypothetical protein 

CHLPN 76kDa Homolog_1 (CT622)

CHLPN 76kDa Homolog_2 (CT623)

mviN-Integral Membrane Protein- (CT624)

nfo-Endonuclease IV- (CT625)
rs4-S4 Ribosomal Protein- (CT626)
yceA- (CT627)

*pyrH/udk-Uridine Kinase (Uridine Monophosphokinase) (Pyrimidine Ribonucleoside Kinase)
ygeD-Efflux Protein- (CT641)
recC-Exodeoxyribonuclease V, Gamma- (CT640)
recB-Exodeoxyribonuclease V, Beta- (CT639)
CT638 hypothetical protein
tyrB-Aromatic AA Aminotransferase- (CT637)
greA-Transcription Elongation Factor- (CT636)
CT635 hypothetical protein

nqrA-Ubiquinone Oxidoreductase, Alpha- (CT634)
hemB-Porphobilinogen Synthase- (CT633)

CT632 hypothetical protein 

CT631 hypothetical protein 

CT631 hypothetical protein (frame-shift) 

ispA-Geranyl Transtransferase- (CT628)

glmU-UDP-GlcNAc Pyrophosphorylase- (CT629)

tctD/cpxR-HTH Transcriptional Regulatory Protein Receiver Domain- (CT630)

CT651 hypothetical protein 

recD_2-Exodeoxyribonuclease V, Alpha_2- (CT652) 

rs20-S20 Ribosomal Protein- (CT617) 

CT616 hypothetical protein 

rpoD-RNA Polymerase Sigma-66 -(CT615) 

folX-Dihydroneopterin Aldolase- (CT614)

folP/dhpS-Dihydropteroate Synthase- (CT613)

folA-Dihydrofolate Reductase- (CT612)

CT611 hypothetical protein 

CT610 hypothetical protein 

recA-RecA recombination protein- (CT650) 

ygfA-Formyltetrahydrofolate Cycloligase- (CT649)

CT648 hypothetical protein 

CT647 hypothetical protein 

CT646 hypothetical protein 

CT645 hypothetical protein 

yohI/nir3-predicted oxidoreductase- (CT644)

topA-DNA Topoisomerase I-Fused to SWI Domain- (CT643)

CT642 hypothetical protein

rpoN-RNA Polymerase Sigma-54- (CT609)

uvrD-DNA Helicase- (CT608)

ung-Uracil DNA Glycosylase- (CT607)

CT606.1 hypothetical protein 

yggV family- (CT606)

CT605 hypothetical protein

groEL_2-heat shock protein-60- (CT604)

tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase- (CT603)
CT602 hypothetical protein
CPn0780


879205 


878591


R 




CPn0781


879773


879198


R 


pal-Peptidoglycan-Associated Lipoprotein- (CT600)


CPn0782


881065


879773 






^rnu f 0 J 


O O X Q □ ^ 


a 0 X X w u 




CT598 hypothetical protein


CPn0784


882296


881892


R


exbD-Biopolymer Transport Protein- (CT597)




O (3 ^ 7 7 X 


882296 


n 
K 


exbB/tolQ-polysaccharide transporter- (CT596)


^&MnT D c 

CrnU /o o 


Afl >1 flC 
0 0 J X 0 3 


0 11 c ? 0 1 

0 0 3^ 7 J 


r 


dsbD/xprA-Thiol:disulfide Interchange Protein- (CT595)


CPn0787 


O 0 C £ 1 Q 


fl Q£^ A 1 
0O04Ux 


F 


yabD/ycfH-PHP superfamily (urease/pyrimidinase) hydrolase- (CT594)


CPn0788


aB0942 


oo74 


F 


sdhC-Succinate Dehydrogenase- (CT593) 


CPn0789


887439


0 a 0 ^ 1 £ 
0 07 J X.O 


F 


sdhA-Succinate Dehydrogenase- (CT592)


^ r\ A ^ n n 


D a Q 1 ^ A 
007 J JO 


Q 0 A Y A ^ 
07U JLO J 


F 


sdhB-Succinate Dehydrogenase- (CT591)




DO lAQA 
O 7 jU3w 


D 7U XXX 


D 


CT590 hypothetical protein




0747l7 


aoii Aft 
a 7 J X wo 


R 


CT589 hypothetical protein 


<~ni ■ m a ^ 
C«Tlu7y J 


896823 


0 0 it Q 1 Q 
0747X7 


R 


rsbU-sigma regulatory family protein-PP2C phosphatase (RsbW antagonist)- (CT588)


CPn0794 


07 lXf% 


D Q fl A A >l 
0 70004 


F 






07d12o 


a 1 DC 
0 77X73 


F 






a a a '4 ai 


901340 


F 




CPn0797 


a a 1 ca a 


902694 


F 






Q A^ O J C 
7U* 04 O 


Q Al 0 C £ 
7aj 030 


F 






a a ^ a a f 
904700 


a a ^ A ^ a 

903940 


R 




^ a a a A 

CPnoBOu 


906532 


905249 


R 


eno-Enolase- (CT587)




a a a £ o*t 
90OD 97 


906727 


R 


uvrB-Excinuclease ABC Subunit B- (CT586)


r^DMAfi A 1 


Q AO ^ ^ A 


□ A 0 ^ A Q 
706707 


R 


trpS-Tryptophanyl tRNA Synthetase- (CT585)




910303 


909752 


R 


CT584 hypothetical protein 


^OwA aa A 


911059


910310 


R 


gp6D-CHLTR Plasmid Paralog- (CT583)




911831 


911067


R


minD-chromosome partitioning ATPase-CHLTR plasmid protein GP5D- (CT582)


^D*»A O A C 


913771 


911867


R


thrS-Threonyl tRNA Synthetase- (CT581)


A D A ^ 


913971


914879 


F 


CT580 hypothetical protein 


W CTtU BUS 


7io^a / 


914956 


R 


CT579 hypothetical protein 


r^DnAP A a 


91770D 


916307 


R 


CT578 hypothetical protein 




01 fl 1 ft J 


7l7o25 


R 


CT577 hypothetical protein 


CPn0811


7XO 7U U 


7XO«UO 


R 


lcrH_1-Low Ca Response Protein H_1- (CT576)


CPn0812




7^UO 0^ 


F 


mutL-DNA Mismatch Repair- (CT575)


CPn0813


920870 


9?1 Q \A 
7 aX7 J4 


F 


pepP-Aminopeptidase P- (CT574)




922107 


7 A J J 3 / 


F 


CT573 hypothetical protein 


CPn0815


923361 


925622 


F


gspD/pilQ-Gen. Secretion Protein D- (CT572)


V* * U W O X o 


925615


7 A /xu^ 


F 


gspE-Gen. Secretion Protein E- (CT571)


CPn0817


7^ / 


Q") 


F 


gspF-Gen. Secretion Protein F- (CT570)


CPn0818


9*0 JX4 


a9ft£a^ 

7«00O2 


F 


predicted OMP (leader (16) peptide)- (CT569)




928689 


9591 T> 

7«7X J« 


F 


CT568 hypothetical protein 


CPn0820 


929120 


929659 




v>4Jof nypv&neuxcax pxouexn 


CPn0821 


929667 


930663 




* ^ u 9 iij^^w wticux w AX pxwwaxn 


CPn0822 


930756 


931229
931501
CPn0824
R


kxxv ivi^^/ XXXV 1 X aiiaxocau 1 On trxotexn— ^wx^o j / 


CPn0825 


933594 


932677
CPn0826


934310 


933612 


R 


yscL-Yop Translocation L-(CT561) 


CPn0827 


935264


934434 


R 


CT560 hypothetical protein


CPn0828


936271 


935267 


R 


yscJ-Yop Translocation J-(CT559) 


CPn0829 


936744 


937296 


F 




CPn0830


937444 


937959


F 




CPn0831 


938267 


938434


F 




CPn0832 


939747 


938827 


R 


lipA-Lipoate Synthetase- (CT558)


CPn0833 


941129 


939747 


R 


lpdA-Lipoamide Dehydrogenase- (CT557)


CPn0834 


941553


942014 


F 


CT556 hypothetical protein


CPn0835


945689


942045 


R 


mot1_1-SWI/SNF family helicase_1- (CT555)


CPn0836 


946879 


945722 


R 


brnQ-Amino Acid (Branched) Transport- (CT554)


CPn0837 


947771 


947145 


R 


nth-Endonuclease III- (CT697)


CPn0838 


949106 


947761 


R 


thdF-Thiophene/Furan Oxidation Protein- (CT698)


CPn0839 


949257 


950159 


F 


psdD-Phosphatidylserine Decarboxylase- (CT699) 


CPn0840 


950222 


951544 


F 


CT700 hypothetical protein 


CPn0841


951731 


954640 


F 


secA_2-Translocase SecA_2- (CT701)


CPn0842


954883 


954710 


R 


CT702 hypothetical protein (frame-shift with 0843)


CPn0843 


955191 


954994 


R 


CT702 hypothetical protein 


CPn0844


956730 


955270 


R 


yphC-GTPase/GTP-binding protein- (CT703)


CPn0845 


958079 


956850 


R 


pcnB_1-Poly A Polymerase_1- (CT704)


CPn0846 


959374 


958112 


R 


clpX-CLP Protease ATPase- (CT705)


CPn0847 


959995


959387 


R 


clpP-CLP Protease Subunit- (CT706)


CPn0848 


961502 


960177 


R 


tig/murl-Trigger Factor-peptidyl-prolyl isomerase- (CT707)


CPn0849


961788


965285 


F 


mot1_2-SWI/SNF family helicase_2- (CT708)


CPn0850


965293 


966390 


F


mreB-Rod Shape Protein-Sugar Kinase- (CT709)
CPn0908
CPn0909
CPn0910
CPn0911
CPn0912
CPn0913
CPn0914
CPn0915
CPn0916
CPn0917
CPn0918
CPn0919
CPn0920
CPn0921



1041589 
1041637 
1041979 
1044043 
1044129 
1045760
1045999 
1046461 
1046837 
1048090 
1049223 
1049378 
1051405 
1051535



1040780 
1041966 
1043004 
1042985 
1045760 
1045945 
1046397 
1046817 
1048084 
1048539 
1048579 
1050430 
1050431 
1052293 



CPn0851


966396


968195 


F 




968316 


970613 


F




970637


971803 


F 


CPn0854 


972837 


971806 


R


CPn0855 


973995 


972994 


R 


CPn0856


975377 


973995 


R 


CPn0857 


975757 


975392 


R 


CPn0858


977055 


975757 


R 


CPn0859 


977588 


977055 


R 


CPn0860 


978630 


977608 


R 


CPn0861 


979722 


978925 


R 


CPn0862 


980873 


979722 


R 


CPn0863


981514 


980831 


R 


CPn0864 


981670 


982374 


F 


CPn0865 


982418 


982942 


F 


CPn0866 


983491 


982916 


R 


CPn0867 


983423 


984667 


F 


CPn0868 


986643 


984670 


R


CPn0869 


987401 


986658 


R


CPn0870 


988723 


987448 


R


CPn0871 


988772 


989899 


F 


CPn0872 


989963 


991216 


F 


CPn0873 


991233 


991694 


F 


CPn0874 


993107 


991749 


F 


CPn0875


993372 


994022 


F 


CPn0876 


994144 


995517 


F 


CPn0877 


995533 


995982 


F 


CPn0878


996654 


995992 


F 


CPn0879 


997439 


996645 


R 


CPn0880


999861 


997444 


R 


CPn0881 


1005667 


1006209 


F 


CPn0882 


1006268 


1007404 


F 


CPn0883


1008865 


1007573 


R 


CPn0884 


1009359 


1009009 


R 


CPn0885 


1010635 


1009433 


R 


CPn0886 


1011276 


1010908 


R 


CPn0887


1011692 


1014157 


F 


CPn0888


1015423 


1014119 




CPn0889


1016835 


1015462 


R 


CPn0890


1017805 


1016819 


R 


CPn0891


1021073 


1017819 


R 


CPn0892


1023661 


1021046 


R 


A O O 


1023894 


1025888 


F 




1026766 


1025888 


R 




1026986 


1027557 


F 


CPn0896


1027595 


1027822 


F 


CPn0897


1028737 


1027853 


R 


CPn0898 


1030460 




R 


CPn0899 


1030875 


1032215 


F 


CPn0900 


1032235 


1033281 


F 


CPn0901


1033287 


1034537 


F 


CPn0902 


1034543 


1035241


F 


CPn0903 


1035263 


1036417 


F 


CPn0904 


1036326 


1037396 


F 


CPn0905 


1037409 


1039835 


F 


CPn0906 


1040340 


1039915 


R 


CPn0907 


1040780 


1040445 


R 



R 
F 
F 
R 
F 
F 
F 
F 
F 
F 
R 
F 
R 
F 



pckA-Phosphoenolpyruvate Carboxykinase- (CT710)

CT711 hypothetical protein

CT712 hypothetical protein

ompB-Outer Membrane Protein B- (CT713)

gpdA-Glycerol-3-P Dehydrogenase- (CT714)

AgX-1 Homolog-UDP-Glucose Pyrophosphorylase- (CT715)

CT716 hypothetical protein

fliI-Flagellum-specific ATP Synthase- (CT717)
CT718 hypothetical protein
fliF-Flagellar M-Ring Protein- (CT719)
nifU-NifU-related protein- (CT720)
yfhO_2-NifS-related protein_2- (CT721)
pgmA-Phosphoglycerate Mutase- (CT722)
yjbC-predicted pseudouridine synthase- (CT723)
CT724 hypothetical protein
birA-Biotin Synthetase- (CT725)
rodA-Rod Shape Protein- (CT726)

zntA/cadA-Metal Transport P-type ATPase- (CT727)
CT728 hypothetical protein
serS-Seryl tRNA Synthetase_2- (CT729)
ribD-Riboflavin Deaminase- (CT730)

ribA&ribB-GTP Cyclohydratase & DHBP Synthase- (CT731)
ribE-Ribityllumazine Synthase- (CT732)
CT733 hypothetical protein
CT734 hypothetical protein

dagA_2-D-Alanine/Glycine Permease_2- (CT735)

ybcL family- (CT736)

SET Domain protein- (CT737)

yycJ-metal dependent hydrolase- (CT738)

ftsK-Cell Division Protein FtsK- (CT739)



dmpP/nqr6-Phenolhydrolase/NADH ubiquinone oxidoreductase- (CT740)

CT741 hypothetical protein

ygcA-rRNA Methyltransferase- (CT742)

hctA-Histone-Like Developmental Protein- (CT743)

CHLTR possible phosphoprotein- (CT744)

hemG-protoporphyrinogen Oxidase- (CT745)

hemN_2-Coproporphyrinogen III Oxidase_2- (CT746)

hemE-Uroporphyrinogen Decarboxylase- (CT747)

mfd-Transcription-Repair Coupling- (CT748)

alaS-Alanyl tRNA Synthetase- (CT749)

tktB-Transketolase- (CT750)

amn-AMP Nucleosidase- (CT751) 

efp_2-Elongation Factor P_2- (CT752)
CT753 hypothetical protein

(possible phosphohydrolase)- (CT754)
Mitochondrial HSP60 Chaperonin Homolog- (CT755)
murF-Muramoyl-DAP Ligase- (CT756)
mraY-Muramoyl-Pentapeptide Transferase- (CT757)
murD-Muramoylalanine-Glutamate Ligase- (CT758)
nlpD-Muramidase (invasin repeat family)- (CT759)
ftsW-Cell Division Protein FtsW- (CT760)
murG-Peptidoglycan Transferase- (CT761)

murC&ddlA-Muramate-Ala Ligase & D-Ala-D-Ala Ligase- (CT762)
CT763 hypothetical protein 

*cutA Periplasmic Divalent Cation Tolerance Protein CutA (C-Type Cytochrome Biogenesis Protein)
CT764 hypothetical protein
rsbV_2-Sigma Factor Regulator_2- (CT765)
miaA-tRNA Pyrophosphate Transferase- (CT766)
Fe-S cluster oxidoreductase_2- (CT767)
CT768 hypothetical protein



ybeB-iojap superfamily ortholog- (CT769)
fabF-Acyl Carrier Protein Synthase- (CT770)
hydrolase/phosphatase homolog- (CT771)
ppa-Inorganic Pyrophosphatase- (CT772)
ldh-Leucine Dehydrogenase- (CT773)

cysQ-Sulfite Synthesis/biphosphate phosphatase- (CT774)
sn-Glycerol-3-P Acyltransferase- (CT775)
1085474
CPn0952


1087478 


1087723 


F 


CPn0953 


1087742 


1088248 


F 


CPn0954 


1088286 


1088708 


F 


CPn0955


1088612 


1089175


F 


CPn0956 


1089560


1090909 


F 


CPn0957 


1093788


1090963 


R 


CPn0958


1094735 


1093793 


R 


CPn0959


1096343 


1094799


R 






1 AQ^^ A"* 

1097102 


F 


CPn0961 


1097118 


1097297 


F 


CPn0962 


1097316 


1098275 


F 


CPn0963 


1098398 


1103224 


F 


CPn0964
CPn0968


1109895
1113461
CPn0971


1114702 


1115415
CPn0972
CPn0973


1116370 


1117527
1118422
1119637
1123693


F 


CPn0979 


1123960 


1125443 


F 


CPn0980 


1126982 


1125504 


R 


CPn0981 


1127031 


1129952 


F 


CPn0982 


1131194 


1129962 


R 


CPn0983 


1132000 


1131206 


R 


CPn0984 


1132379 


1135510 


F 


CPn0985


1135534 


1136571 


F 


CPn0986 


1136724 


1137395 


F 


CPn0987 


1137516 


1138115 


F


CPn0988 


1138986 


1138075 


R 


CPn0989 


1139495 


1139016 


R 


CPn0990 


1139883 


1140440 


F 


CPn0991 


1140421 


1140612 


F 



aas-Acylglycerophosphoethanolamine Acyltransferase- (CT776)

bioF_1-Oxononanoate Synthase_1- (CT777)

priA-Primosomal Protein N'- (CT778)

CT779 hypothetical protein

Thioredoxin Disulfide Isomerase- (CT780)

*CHLPS 43 kDa protein homolog_2

*CHLPS 43 kDa protein homolog_3

*CHLPS 43 kDa protein homolog_4

lysS-Lysyl tRNA Synthetase- (CT781)
cysS-Cysteinyl tRNA Synthetase- (CT782)
predicted disulfide bond isomerase- (CT783)
rnpA-Ribonuclease P Protein Component- (CT784)
rl34-L34 Ribosomal Protein- (CT785)
rl36-L36 Ribosomal Protein- (CT786)
rs14-S14 Ribosomal Protein- (CT787)

CT788 hypothetical protein-(leader (50) peptide-periplasmic)

CT790 hypothetical protein

uvrC-Excinuclease ABC, Subunit C- (CT791)

mutS-DNA Mismatch Repair- (CT792)

dnaG/priM-DNA Primase- (CT794)

CT794.1 hypothetical protein

CT795 hypothetical protein
glyQ-Glycyl tRNA Synthetase- (CT796)

pgsA_2-Glycerol-3-P-Phosphatidyltransferase_2- (CT797)

glgA-Glycogen Synthase- (CT798)

ctc-General Stress Protein- (CT799)

pth-Peptidyl tRNA Hydrolase- (CT800)

rs6-S6 Ribosomal Protein- (CT801)

rs18-S18 Ribosomal Protein- (CT802)

rl9-L9 Ribosomal Protein- (CT803)

ychB-Predicted Kinase- (CT804)

(frame-shift with 0954)

CT805 hypothetical protein

ide/ptr-Insulinase family/Protease III- (CT806)

plsB-Glycerol-3-P Acyltransferase- (CT807)

cafE-Axial Filament Protein- (CT808)

CT809 hypothetical protein

rl32-L32 Ribosomal Protein- (CT810)

plsX-FA/Phospholipid Synthesis Protein- (CT811)

pmp_21-Polymorphic Outer Membrane Protein D Family- (CT812)

lpxB-Lipid A Disaccharide Synthase- (CT411)
pcnB_2-PolyA Polymerase_2- (CT410)
mrsA/pgm-Phosphoglucomutase- (CT815)

glmS-Glucosamine-Fructose-6-P Aminotransferase- (CT816)

0969-tyrP_1-Tyrosine Transport_1- (CT817)

0970-tyrP_2-Tyrosine Transport_2- (CT818)

yccA-Transport Permease- (CT819)
ftsY-Cell Division Protein FtsY- (CT820)
sucC-Succinyl-CoA Synthetase, Beta- (CT821)
sucD-Succinyl-CoA Synthetase, Alpha- (CT822)



htrA-DO Serine Protease- (CT823)

*similarity to Saccharomyces cerevisiae hypothetical 52.9KD protein
Zinc Metalloprotease (insulinase family)- (CT824)
yigN family- (CT825)

pssA-Glycerol-Serine Phosphatidyltransferase- (CT826)
nrdA-Ribonucleoside Reductase, Large Chain- (CT827)
nrdB-Ribonucleoside Reductase, Small Chain- (CT828)
yggH-predicted rRNA Methylase- (CT829)
ycgB-like predicted rRNA methylase- (CT830)
murB-UDP-N-Acetylenolpyruvoylglucosamine Reductase- (CT831)
CT832 hypothetical protein
infC-Initiation Factor 3- (CT833)
rl35-L35 Ribosomal Protein- (CT834)
CPn0992


1140634


1140996 


F 


CPn0993


1141014


1142030 


F 


CPn0994 


1142398 


1144440 


F 


CPn0995 


1145512 


1144415 


R 


CPn0996 


1146589 


1145519 


R 


CPn0997 


1146708 


1147664 


F 


CPn0998


1147855 


1150584 


F 


CPn0999 


1152847 


1150766 


R 


CPn1000


1153157 


1152891 


R 


CPn1001


1153405 


1153869 


F 


CPn1002


1153862 


1154089 


F 




1154796 


1154092 


R 


CPn1004


1155397 


1154879 


R 


CPn1005


1155933 


1155415 


R 


CPn1006


1156472 


1155990 


R 


CPn1007


1156689 


1156907 




CPn1008


1156928 


1158223 




CPn1009


1159058 


1158186 


R 


CPn1010


1159672 


1159067 


R 


CPn1011


1160306 


1159902 


R 


CPn1012


1162193 


1160421 


R 


CPn1013


1162245 


1163624 


F


CPn1014


1165426


1163732 


R 


CPn1015


1165634 


1166893 




CPn1016


1167042 


1168898 




CPn1017


1169006 


1169935 


F 


CPn1018


1169898 


1170629 


F 


CPn1019


1172128 


1170639 


R


CPn1020


1173679 


1172150 


R 


CPn1021


1174213 


1173698 


R 


CPn1022


1175673 


1174216 


R 


CPn1023


1176035 


1176331 


F 


CPn1024


1177236 


1176334 


R 


CPn1025


1177302 


1178879 


F 


CPn1026


1178997


1179137


F 


CPn1027


1179175 


1180755 


F 


CPn1028


1181016 


1181999 


F 


CPn1029


1182008 


1182844 


F


CPn1030


1183886 


1182843 


R 


CPn1031


1185552 


1184098 


R 


CPn1032


1186150


1185566 


R 


CPn1033


1187500 


1186187 


R 


CPn1034


1188517 


1187732 


R 


CPn1035


1190000 


1188570 


R 


CPn1036


1191135


1189964 


R 


CPn1037


1192199 


1191123 


R 


CPn1038


1192726 


1192199 


R 


CPn1039


1193999 


1192665 


R 


CPn1040


1194741 


1194073 


R 


CPn1041


1195994


1194726 


R 


CPn1042


1196590


1195934 


R 


CPn1043


1197717


1196572


R 


CPn1044


1198691


1197699


R 


CPn1045


1199590


1198901


R 


CPn1046


1200675 


1199590 


R 


CPn1047


1200552 


1201343 


F 


CPn1048


1201606


1202604 


F 


CPn1049


1202595 


1203914 


F 


CPn1050


1203926 


1204798 


F 


CPn1051


1204962 


1205270 


F 


CPn1052


1205417 


1206169 


F 


CPn1053


1206153 


1206701 


F 


CPn1054


1207034 


1209466 


F 


CPn1055


1209694 


1210521 


F 


CPn1056


1210527 


1211228 


F 


CPn1057


1211497 


1213596 


F 


CPn1058


1213748 


1214836 


F 


CPn1059


1214848 


1215678 


F 


CPn1060


1217658 


1215727 


R 


CPn1061


1217920 


1217666 


R 


CPn1062


1219820 


1218159 


R 


CPn1063


1219951 


1220712 


F 



rl20-L20 Ribosomal Protein- (CT835)

pheS-Phenylalanyl tRNA Synthetase, Alpha- (CT836)

CT837 hypothetical protein

CT838 hypothetical protein

CT839 hypothetical protein

mesJ-pp-loop superfamily ATPase- (CT840)

ftsH-ATP-dependent zinc protease- (CT841)

pnp-Polyribonucleotide Nucleotidyltransferase- (CT842)

rs15-S15 Ribosomal Protein- (CT843)

yfhC-cytosine deaminase- (CT844)

CT845 hypothetical protein 

CT846 hypothetical protein 

CT847 hypothetical protein

CT848 hypothetical protein

CT849 hypothetical protein

CT849.1 hypothetical protein

CT850 hypothetical protein

map-Methionine Aminopeptidase- (CT851)

CT852 hypothetical protein 

CT853 hypothetical protein 

yzeB-ABC transporter permease- (CT854)

fumC-Fumarate Hydratase- (CT855)

ychM-Sulfate Transporter- (CT856)

CT857 hypothetical protein (possible IM protein)

CT858 hypothetical protein

lytB-Metalloprotease- (CT859)

CT860 hypothetical protein
CT861 hypothetical protein
lcrH_2-Low Calcium Response_2- (CT862)
CT863 hypothetical protein

xerD-Integrase/recombinase- (CT864)
pgi-Glucose-6-P Isomerase- (CT378)
ltuA- (CT377)

mdhC-Malate Dehydrogenase- (CT376)

predicted D-amino acid dehydrogenase- (CT375)
arcD-Arginine/Ornithine Antiporter- (CT374)
CT373 hypothetical protein
CT372 hypothetical protein

Predicted OMP_1- (CT371) (leader (18) peptide)
aroE-Shikimate 5-Dehydrogenase- (CT370)
aroB-Dehydroquinate Synthase- (CT369)
aroC-Chorismate Synthase- (CT368)
aroL-Shikimate Kinase II- (CT367)
aroA-Phosphoshikimate Vinyltransferase- (CT366)

*'bioA-Adenosylfflethxonine-8-Amino-7-Oxononanoate Aminotransferase 
*bioD-dethiobiotin synthetase 
bioF.2-Oxononanoate Synthase_2 
*bioB-Biotin Synthase 

'conserved hypothetical bacterial membrane protein 
•Tryptophan Kyroxylase 

dapa-Dihydrodipicolinate Reductase- (CT364) 
asd-Aspartace Dehydrogenase- (CT363) 
lysC-Aspartokinase III-{CT362) 
dapA-Dihydrodipicolinate Synthase- (CT361) 



CT356 hypothetical protein 
CT355 hypothetical protein 
kgsA-Dimethyladenosine Transferase- (CT354) 
dxs/t)ct -Trans )cetolase-{CT331) 
CT330 hypothetical protein 
xseA-Exodoxyribonuclease Vii- tcT329) 
tpiS-Triosephosphate Isomeraae- ICT328) 



WO 00/27994 PCT/US99/26923



CPn1064 1220719 1220895 F
CPn1065 1221095 1220928 R
CPn1066 1221135 1221488 F
CPn1067 1221735 1222292 F
CPn1068 1223258 1222365 R
CPn1069 1223513 1223941 F
CPn1070 1225511 1224144 R
CPn1071 1227324 1225885 R
CPn1072 1227969 1228835 F
CPn1073 1223011 1229832 F



def-Polypeptide Deformylase-(CT353)
rnhB_2-Ribonuclease HII_2-(CT008)
yfgA-HTH Transcriptional Regulator-(CT009)
Predicted OMP_2-(CT371)

Table 2 (Supplemental Data). Functional Assignments of C. pneumoniae Coding Sequences. C. trachomatis genes are shown in parentheses.



Amino Acid Biosynthesis 






Aromatic Family 



1039 (CT366) aroA Phosphoshikimate Vinyltransferase
1036 (CT369) aroB Dehydroquinate Synthase
1037 (CT368) aroC Chorismate Synthase
1035 (CT370) aroE Shikimate 5-Dehydrogenase
0484 (CT382) aroG Deoxyheptonate Aldolase
1038 (CT367) aroL Shikimate Kinase II
0740 (CT637) tyrB Aromatic AA Aminotransferase


Aspartate Family (lysine)
1048 (CT363) asd Aspartate Dehydrogenase
1050 (CT361) dapA Dihydrodipicolinate Synthase
1047 (CT364) dapB Dihydrodipicolinate Reductase
0519 (CT430) dapF Diaminopimelate Epimerase
1049 (CT362) lysC Aspartokinase III



Serine Family
0433 (CT282) gcsH Glycine Cleavage System H Protein
0521 (CT432) glyA Serine Hydroxymethyltransferase



Base & Nucleotide Metabolism
0171 guaA GMP Synthase
0172 guaB Inosine 5'-Monophosphate Dehydrogenase
0608 Uridine 5'-Monophosphate Synthase
0735 Uridine Kinase
0244 (CT128) adk Adenylate Kinase
0894 (CT751) amn AMP Nucleosidase
0568 (CT452) cmk CMP Kinase
0392 (CT039) dcd dCTP Deaminase
0059 (CT292) dut dUTP Nucleotidohydrolase
0120 (CT030) gmk GMP Kinase
0619 (CT500) ndk Nucleoside-2-P Kinase
0984 (CT827) nrdA Ribonucleoside Reductase, Large Chain
0985 (CT828) nrdB Ribonucleoside Reductase, Small Chain
0236 (CT183) pyrG CTP Synthetase
0698 (CT678) pyrH UMP Kinase
0273 (CT188) tdk Thymidylate Kinase
0659 (CT539) trxA Thioredoxin
0314 (CT099) trxB Thioredoxin Reductase
1001 (CT844) yfhC Cytosine Deaminase



Biosynthesis of Cofactors
Biotin, Lipoate & Ubiquinone
1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
1044 bioB Biotin Synthase
1042 bioD Dethiobiotin Synthetase
0923 (CT777) bioF_1 Oxononanoate Synthase_1
1043 (CT777) bioF_2 Oxononanoate Synthase_2
0866 (CT725) birA Biotin Synthetase
0748 (CT628) ispA Geranyl Transtransferase


0832 (CT558) lipA Lipoate Synthetase
0265 (CT219) ubiA Benzoate Octaphenyltransferase
0264 (CT220) ubiD Phenylacrylate Decarboxylase
0515 (CT428) ubiE Ubiquinone Methyltransferase
Folic Acid
0759 (CT612) folA Dihydrofolate Reductase
0335 (CT078) folD Methylene Tetrahydrofolate Dehydrogenase
0758 (CT613) folP Dihydropteroate Synthase
0757 (CT614) folX Dihydroneopterin Aldolase
0763 (CT649) ygfA Formyltetrahydrofolate Cycloligase
Porphyrin
0714 (CT662) hemA Glutamyl tRNA Reductase
0744 (CT633) hemB Porphobilinogen Synthase
0052 (CT299) hemC Porphobilinogen Deaminase
0890 (CT747) hemE Uroporphyrinogen Decarboxylase
0888 (CT745) hemG Protoporphyrinogen Oxidase
0138 (CT210) hemL Glutamate-1-Semialdehyde-2,1-Aminomutase
0380 (CT052) hemN_1 Coproporphyrinogen III Oxidase_1
0889 (CT746) hemN_2 Coproporphyrinogen III Oxidase_2
0603 (CT485) hemZ Ferrochelatase
Riboflavin
0872 (CT731) ribA&ribB GTP Cyclohydratase & DHBP Synthase
0532 (CT405) ribC Riboflavin Synthase
0871 (CT730) ribD Riboflavin Deaminase
0873 (CT732) ribE Ribityllumazine Synthase
0320 (CT093) ribF FAD Synthase



Cell Envelope 

Fatty Acid & Phospholipid Metabolism
0161 (CT206) (predicted acyltransferase family)
0922 (CT776) aas Acylglycerophosphoethanolamine Acyltransferase
0414 (CT265) accA AcCoA Carboxylase/Transferase Alpha
0183 (CT123) accB Biotin Carboxyl Carrier Protein
0182 (CT124) accC Biotin Carboxylase
0058 (CT293) accD AcCoA Carboxylase/Transferase Beta
0295 (CT236) acpP Acyl Carrier Protein
0313 (CT100) acpS Acyl-carrier Protein Synthase
0567 (CT451) cdsA Phosphatidate Cytidylyltransferase
0297 (CT238) fabD Malonyl Acyl Carrier Transacylase
0916 (CT770) fabF Acyl Carrier Protein Synthase
0296 (CT237) fabG Oxoacyl (Carrier Protein) Reductase
0298 (CT239) fabH Oxoacyl Carrier Protein Synthase III
0406 (CT104) fabI Enoyl-Acyl-Carrier Protein Reductase
0651 (CT532) fabZ Myristoyl-Acyl Carrier Dehydratase
0098 (CT010) htrB Acyltransferase
0271 (CT136) Lysophospholipase Esterase
0615 (CT496) pgsA_1 Glycerol-3-P Phosphatidyltransferase_1
0947 (CT797) pgsA_2 Glycerol-3-P Phosphatidyltransferase_2
0958 (CT807) plsB Glycerol-3-P Acyltransferase
0569 (CT453) plsC Glycerol-3-P Acyltransferase
0962 (CT811) plsX FA/Phospholipid Synthesis Protein
0839 (CT699) psdD Phosphatidylserine Decarboxylase
0983 (CT826) pssA Glycerol-Serine Phosphatidyltransferase
0921 (CT775) snGlycerol-3-P Acyltransferase
0654 (CT535) yciA Acyl-CoA Thioesterase
0877 (CT736) ybcL CT736 Hypothetical Protein



LPS
0154 (CT208) gseA KDO Transferase
0721 (CT655) kdsA
0235 (CT182) kdsB Deoxyoctulonosic Acid Synthetase
0650 (CT531) lpxA Acyl-Carrier UDP-GlcNAc O-Acyltransferase
0965 (CT411) lpxB Lipid A Disaccharide Synthase
0652 (CT533) lpxC Myristoyl GlcNAc Deacetylase
0302 (CT243) lpxD UDP Glucosamine N-Acyltransferase


Membrane Proteins, Lipoproteins & Porins
0310 (CT251) 60IM 60kDa Inner Membrane Protein
0556 (CT442) 15kDa Cysteine-Rich Protein
0653 (CT534) cutE Apolipoprotein N-Acetyltransferase
0311 (CT252) Prolipoprotein Diacylglycerol Transferase
0558 (CT444) 9kDa-Cysteine-Rich Lipoprotein
0557 (CT443) omcB 60kDa Cysteine-Rich OMP
0695 (CT681) Major Outer Membrane Protein
0854 (CT713) ompB Outer Membrane Protein B
0781 (CT600) pal Peptidoglycan-Associated Lipoprotein
0300 (CT241) yaeT Omp85 Homolog


Peptidoglycan
0417 (CT268) amiA N-Acetylmuramoyl Alanine Amidase
0780 (CT601) amiB N-Acetylmuramoyl-L-Ala Amidase
0672 (CT551) dacF D-Ala-D-Ala Carboxypeptidase
0968 (CT816) glmS Glucosamine-Fructose-6-P Aminotransferase
0749 (CT629) glmU UDP-GlcNAc Pyrophosphorylase
0900 (CT757) mraY Muramoyl-Pentapeptide Transferase
0571 (CT455) murA UDP-N-Acetylglucosamine Transferase
0988 (CT831) murB UDP-N-Acetylenolpyruvoylglucosamine Reductase
0905 (CT762) murC&ddlA Muramate-Ala Ligase & D-Ala-D-Ala Ligase
0901 (CT758) murD Muramoylalanine-Glutamate Ligase
0418 (CT269) murE N-Acetylmuramoylalanylglutamyl DAP Ligase
0899 (CT756) murF Muramoyl-DAP Ligase
0904 (CT761) murG Peptidoglycan Transferase
0902 (CT759) nlpD Muramidase (invasin repeat family)
0694 (CT682) pbp2 PBP2-Transglycolase/Transpeptidase
0419 (CT270) pbp3 Transglycolase/Transpeptidase
0421 (CT272) yabC PBP2B Family Methyltransferase








Cellular Processes
Cell Division
0959 (CT808) cafE Axial Filament Protein
0880 (CT739) ftsK Cell Division Protein FtsK
0903 (CT760) ftsW Cell Division Protein FtsW
0972 (CT820) ftsY Cell Division Protein FtsY
0617 (CT498) gidA FAD-dependent Oxidoreductase
0805 (CT582) minD Chromosome Partitioning ATPase
0850 (CT709) mreB Rod Shape Protein-Sugar Kinase
0867 (CT726) rodA Rod Shape Protein
0684 (CT688) parB Chromosome Partitioning Protein


Detoxification
0057 (CT294) sodM Superoxide Dismutase (Mn)
0778 (CT603) ahpC Thio-specific Antioxidant (TSA) Peroxidase


Signal Transduction
0148 (CT145) S/T Protein Kinase
0584 (CT467) atoS Two-Component Sensor
0294 (CT235) cAMP-Dependent Protein Kinase Regulatory Subunit


0712 (CT664) (FHA domain)
0478 (CT379) hflX GTP Binding Protein
0703 (CT673) S/T Protein Kinase
0095 (CT301) S/T Protein Kinase
0397 (CT259) PP2C Phosphatase Family
0037 (CT337) ptsH PTS Phosphocarrier Protein Hpr
0038 (CT336) ptsI PTS PEP Phosphotransferase
0060 (CT291) ptsN_1 PTS IIA Protein_1
0061 (CT290) ptsN_2 PTS IIA Protein + HTH DNA-Binding Domain
0262 (CT218) surE SurE-like Acid Phosphatase
0838 (CT698) thdF Thiophene/Furan Oxidation Protein
0693 (CT683) TPR Repeats-CT683 Hypothetical Protein
0321 (CT092) ychF GTP Binding Protein
0544 (CT418) yhbZ GTP binding protein
0844 (CT703) yphC GTPase/GTP-binding protein


Standard Protein Secretion
0115 (CT025) ffh Signal Recognition Particle GTPase
0363 (CT060) flhA Flagellar Secretion Protein
0858 (CT717) fliI Flagellum-specific ATP Synthase
0704 (CT672) fliN Flagellar Motor Switch Domain/YscQ family
0815 (CT572) gspD Gen. Secretion Protein D
0816 (CT571) gspE Gen. Secretion Protein E
0817 (CT570) gspF Gen. Secretion Protein F
0359 (CT064) lepA GTPase
0110 (CT020) lepB Signal Peptidase I
0535 (CT408) lspA Lipoprotein Signal Peptidase
0260 (CT141) secA_1 Protein Translocase Subunit_1
0841 (CT701) secA_2 Translocase SecA_2
0564 (CT448) secD&secF Protein Export Proteins SecD/SecF (fusion)
0075 (CT321) secE Preprotein Translocase
0629 (CT510) secY Translocase
0848 (CT707) tig Trigger Factor-Peptidyl-prolyl Isomerase


Transport-Related Proteins
0486 Hypothetical Proline Permease
0289 (CT230) aaaT Neutral Amino Acid (Glutamate) Transporter
0691 (CT685) abcX ABC Transporter ATPase
1031 (CT374) arcD Arginine/Ornithine Antiporter
0482 (CT381) artJ Arginine Periplasmic Binding Protein
0836 (CT554) brnQ Amino Acid (Branched) Transport
0536 (CT409) dagA_1 D-Ala/Gly Permease_1
0876 (CT735) dagA_2 D-Alanine/Glycine Permease_2
0682 (CT690) dppD ABC ATPase Dipeptide Transport
0683 (CT689) dppF ABC ATPase Dipeptide Transport
0280 (CT689) dppF Dipeptide Transporter ATPase
0785 (CT596) exbB Macromolecule Transporter
0784 (CT597) exbD Biopolymer Transport Protein
0604 (CT486) fliY Glutamine Binding Protein
0192 (CT129) glnP ABC Amino Acid Transporter Permease
0191 (CT130) glnQ ABC Amino Acid Transporter ATPase
0528 (CT401) gltT Glutamate Symport
0286 (CT194) mgtE Mg++ Transporter (CBS Domain)
0413 (CT264) msbA Transport ATP Binding Protein
0290 (CT231) Na+-dependent Transporter
0195 (CT198) oppA_1 Oligopeptide Binding Protein_1
0196 (CT198) oppA_2 Oligopeptide Binding Protein_2
0197 (CT139) oppA_3 Oligopeptide Binding Protein_3


0198 (CT175) oppA_4 Oligopeptide Binding Protein_4
0599 (CT480) oppA_5 Oligopeptide Binding Lipoprotein_5
0199 (CT199) oppB_1 Oligopeptide Permease_1
0598 (CT479) oppB_2 Oligopeptide Permease_2
0200 (CT200) oppC_1 Oligopeptide Permease_1
0597 (CT478) oppC_2 Oligopeptide Permease_2
0201 (CT201) oppD Oligopeptide Transport ATPase
0202 (CT202) oppF Oligopeptide Transport ATPase
0231 (CT180) tauB ABC Transport ATPase (Nitrate/Fe)
0782 (CT599) tolB Macromolecule Transporter
0969 (CT817) tyrP_1 Tyrosine Transport_1
0970 (CT818) tyrP_2 Tyrosine Transport_2
0665 (CT544) uhpC Hexosphosphate Transport
0282 (CT216) xasA Amino Acid Transporter
0207 (CT204) ybhI Dicarboxylate Translocator
0971 (CT819) yccA Transport Permease
0248 (CT152) ycfV ABC Transporter ATPase
1014 (CT856) ychM Sulfate Transporter
0736 (CT641) ygeD Efflux Protein
0680 (CT692) ygo4 Phosphate Permease
0723 (CT653) yhbG ABC Transporter ATPase
0023 (CT348) yjjK ABC Transporter Protein ATPase
0127 (CT034) ytfF Cationic Amino Acid Transporter
0349 (CT067) ytgA Solute Protein Binding Family
0348 (CT068) ytgB ABC Transporter ATPase
0347 (CT069) ytgC Integral Membrane Protein
0346 (CT070) ytgD Integral Membrane Protein
1012 (CT854) yzeB ABC Transporter Permease
0868 (CT727) zntA Metal Transport P-type ATPase
0279 Possible ABC Transporter Permease Protein
0543 (CT417) (Metal Transport Protein)
0692 (CT684) ABC Transporter
0542 (CT416) ABC Transporter ATPase
0690 (CT686) ABC Transporter Membrane Protein
0541 (CT415) solute binding protein
Type-III Secretion
0323 (CT090) lcrD Low Calcium Response D
0324 (CT089) lcrE Low Calcium Response E
0811 (CT576) lcrH_1 Low Ca Response Protein H_1
1021 (CT862) lcrH_2 Low Calcium Response_2
0325 (CT088) sycE Secretion Chaperone
0702 (CT674) yscC Yop C/Gen Secretion Protein D
0828 (CT559) yscJ Yop Translocation J
0826 (CT561) yscL Yop Translocation L
0707 (CT669) yscN Yop N (Flagellar-Type ATPase)
0825 (CT562) yscR Yop Translocation R
0824 (CT563) yscS YopS Translocation Protein
0823 (CT564) yscT YopT Translocation T
0322 (CT091) yscU Yop Translocation Protein U
Central Intermediary Metabolism
Glycogen Metabolism
0856 (CT715) UDP-Glucose Pyrophosphorylase
0948 (CT798) glgA Glycogen Synthase
0475 (CT866) glgB Glucan Branching Enzyme
0607 (CT489) glgC Glucose-1-P Adenyltransferase
0307 (CT248) glgP Glycogen Phosphorylase
0388 (CT042) glgX Glycogen Hydrolase (debranching)
0326 (CT087) malQ Glucanotransferase
0851 (CT710) pckA Phosphoenolpyruvate Carboxykinase
Phosphorous & Sulfur
0548 (CT435) cysJ Sulfite Reductase
0920 (CT774) cysQ Sulfite Synthesis/Biphosphate Phosphatase
0025 (CT346) atsA Sulphohydrolase
0918 (CT772) ppa Inorganic Pyrophosphatase






DNA Replication, Modification, Repair & Recombination
DNA Mismatch Repair
0505 3-Methyladenine DNA Glycosylase
0812 (CT575) mutL DNA Mismatch Repair
0941 (CT792) mutS DNA Mismatch Repair
0402 (CT107) mutY Adenine Glycosylase
0732 (CT625) nfo Endonuclease IV
0837 (CT697) nth Endonuclease III
DNA Modification
0596 (CT477) ada Methyltransferase
0114 (CT024) hemK A/G-specific Methylase
0891 (CT748) mfd Transcription-Repair Coupling
0620 (CT501) ruvA Holliday Junction Helicase
0390 (CT040) ruvB Holliday Junction Helicase
0621 (CT502) ruvC Crossover Junction Endonuclease
0053 (CT298) sms Sms Protein
0773 (CT607) ung Uracil DNA Glycosylase
1062 (CT329) xseA Exodeoxyribonuclease VII
DNA Recombination
0762 (CT650) recA RecA Recombination Protein
0738 (CT639) recB Exodeoxyribonuclease V, Beta
0737 (CT640) recC Exodeoxyribonuclease V, Gamma
0123 (CT033) recD_1 Exodeoxyribonuclease V (Alpha Subunit)_1
0752 (CT652) recD_2 Exodeoxyribonuclease V, Alpha_2
0339 (CT074) recF ABC Superfamily ATPase
0340 (CT074) (frame-shift with 0339)
0563 (CT447) recJ ssDNA Exonuclease
0299 (CT240) recR Recombination Protein
DNA Replication



0309 (CT250) dnaA_1 Replication Initiation Protein_1
0424 (CT275) dnaA_2 Replication Initiation Factor_2
0616 (CT497) dnaB Replicative DNA Helicase
0666 (CT545) dnaE DNA Pol III Alpha
0942 (CT794) dnaG DNA Primase
0338 (CT075) dnaN DNA Pol III (Beta)
0410 (CT261) dnaQ_1 DNA Pol III Epsilon Chain_1
0655 (CT536) dnaQ_2 DNA Pol III Epsilon Chain_2
0040 (CT334) dnaX_1 DNA Pol III Gamma and Tau_1
0272 (CT187) dnaX_2 DNA Pol III Gamma and Tau_2
0149 (CT146) dnlJ DNA Ligase
0274 (CT189) gyrA_1 DNA Gyrase Subunit A_1
0716 (CT660) gyrA_2 DNA Gyrase Subunit A_2
0275 (CT190) gyrB_1 DNA Gyrase Subunit B_1
0715 (CT661) gyrB_2 DNA Gyrase Subunit B_2
0416 (CT267) himD Integration Host Factor Alpha
0612 (CT493) polA DNA Polymerase I
0924 (CT778) priA Primosomal Protein N'


0386 (CT044) ssb SS DNA Binding Protein
0835 (CT555) SWI/SNF family helicase_1
0849 (CT708) SWI/SNF family helicase_2
0769 (CT643) topA DNA Topoisomerase I-Fused to SWI Domain
0024 (CT347) xerC Integrase/recombinase
1024 (CT864) xerD Integrase/recombinase
Eukaryotic-Type Chromatin Factors
0886 (CT743) hctA Histone-Like Developmental Protein
0384 (CT046) hctB Histone-like Protein 2
0878 (CT737) SET Domain protein
0577 (CT460) SWIB (YM74) Complex Protein
UVR Excinuclease Repair System
0096 (CT333) uvrA Excinuclease ABC Subunit A
0801 (CT586) uvrB Excinuclease ABC Subunit B
0940 (CT791) uvrC Excinuclease ABC, Subunit C
0772 (CT608) uvrD DNA Helicase








Energy Metabolism
Aerobic
0855 (CT714) gpdA Glycerol-3-P Dehydrogenase
0743 (CT634) nqrA Ubiquinone Oxidoreductase, Alpha
0427 (CT278) nqr2 NADH (Ubiquinone) Dehydrogenase
0428 (CT279) nqr3 NADH (Ubiquinone) Oxidoreductase, Gamma
0429 (CT280) nqr4 NADH (Ubiquinone) Reductase 4
0430 (CT281) nqr5 NADH (Ubiquinone) Reductase 5
0883 (CT740) nqr6 Phenolhydrolase/NADH (Ubiquinone) Oxidoreductase 6




ATP Biogenesis and Metabolism
0351 (CT065) adt_1 ADP/ATP Translocase_1
0614 (CT495) adt_2 ADP/ATP Translocase_2
0088 (CT308) atpA ATP Synthase Subunit A
0089 (CT307) atpB
0090 (CT306) atpD ATP Synthase Subunit D
0086 (CT310) atpE ATP Synthase Subunit E
0091 (CT305) atpI ATP Synthase Subunit I
0092 (CT304) atpK ATP Synthase Subunit K
0860 (CT719) fliF Flagellar M-Ring Protein




Electron Transport Chain
0102 (CT013) cydA Cytochrome Oxidase Subunit I
0103 (CT014) cydB Cytochrome Oxidase Subunit II
0364 (CT059) Ferredoxin
0084 (CT312) Predicted Ferredoxin




Glycolysis & Gluconeogenesis
0281 (CT215) dhnA Predicted 1,6-Fructose Biphosphate Aldolase
0800 (CT587) eno Enolase
0624 (CT505) gapA Glyceraldehyde-3-P Dehydrogenase
0056 (CT295) mrsA Phosphomannomutase
0967 (CT815) pgm Phosphoglucomutase
0160 (CT207) pfkA_1 Fructose-6-P Phosphotransferase_1
0208 (CT205) pfkA_2 Fructose-6-P Phosphotransferase_2
1025 (CT378) pgi Glucose-6-P Isomerase
0679 (CT693) pgk Phosphoglycerate Kinase
0863 (CT722) pgmA Phosphoglycerate Mutase
0097 (CT332) pyk Pyruvate Kinase
1063 (CT328) tpiS Triosephosphate Isomerase




Pentose Phosphate Pathway
0239 (CT186) devB Glucose-6-P Dehydrogenase (DevB family)
1060 (CT331) dxs Transketolase
0360 (CT063) gnd 6-Phosphogluconate Dehydrogenase
0185 (CT121) rpe Ribulose-P Epimerase
0141 (CT213) rpiA Ribose-5-P Isomerase A
0083 (CT313) tal Transaldolase
0893 (CT750) tktB Transketolase
0238 (CT185) zwf Glucose-6-P Dehydrogenase


Pyruvate Dehydrogenase
0833 (CT557) lpdA Lipoamide Dehydrogenase
0436 (CT285) lplA_1 Lipoate Protein Ligase-Like Protein
0618 (CT499) lplA_2 Lipoate-Protein Ligase A
0033 (CT340) pdhA&B Oxoisovalerate Dehydrogenase α/β Fusion
0304 (CT245) pdhA Pyruvate Dehydrogenase Alpha
0305 (CT246) pdhB Pyruvate Dehydrogenase Beta
0306 (CT247) pdhC Dihydrolipoamide Acetyltransferase



TCA Cycle
0495 (CT390) aspC Aspartate Aminotransferase
1013 (CT855) fumC Fumarate Hydratase
1028 (CT376) mdhC Malate Dehydrogenase
0789 (CT592) sdhA Succinate Dehydrogenase
0790 (CT591) sdhB Succinate Dehydrogenase
0788 (CT593) sdhC Succinate Dehydrogenase
0378 (CT054) sucA Oxoglutarate Dehydrogenase
0377 (CT055) sucB_1 Dihydrolipoamide Succinyltransferase_1
0527 (CT400) sucB_2 Dihydrolipoamide Succinyltransferase_2
0973 (CT821) sucC Succinyl-CoA Synthetase, Beta
0974 (CT822) sucD Succinyl-CoA Synthetase, Alpha



Protein Folding, Assembly & Modification
Chaperones
0949 (CT799) ctc General Stress Protein
0534 (CT407) dksA DnaK Suppressor
0032 (CT341) dnaJ Heat Shock Protein J
0503 (CT396) dnaK Hsp-70
0134 (CT110) groEL_1 Hsp-60_1
0777 (CT604) groEL_2 Hsp-60_2
0898 (CT755) groEL_3 Hsp-60_3
0135 (CT111) groES 10KDa Chaperonin
0502 (CT395) grpE HSP-70 Cofactor
0661 (CT541) mip FKBP-type Peptidyl-prolyl Cis-Trans Isomerase


Proteases
0144 (CT113) clpB Clp Protease ATPase
0437 (CT286) clpC ClpC Protease
0520 (CT431) clpP_1 CLP Protease
0847 (CT706) clpP_2 CLP Protease Subunit
0846 (CT705) clpX CLP Protease ATPase
0269 (CT138) Dipeptidase
0998 (CT841) ftsH ATP-dependent Zinc Protease
0030 (CT343) gcp_1 O-Sialoglycoprotein Endopeptidase_1
0194 (CT197) O-Sialoglycoprotein Endopeptidase_2
0979 (CT823) htrA DO Serine Protease
0957 (CT806) ide Insulinase family/Protease III
0027 (CT344) lon Lon ATP-dependent Protease
1017 (CT859) lytB Metalloprotease
1009 (CT851) map Methionine Aminopeptidase
0385 (CT045) pepA Leucyl Aminopeptidase A




0136 (CT112) pepF Oligopeptidase
0813 (CT574) pepP Aminopeptidase P
0613 (CT494) sohB Protease
0555 (CT441) tsp Tail-Specific Protease
0344 (CT072) yaeL Metalloprotease
0981 (CT824) Zinc Metalloprotease (insulinase family)


Protein Isomerases
0227 (CT176) dsbB Disulfide bond Oxidoreductase
0786 (CT595) dsbD Thio:disulfide Interchange Protein
0228 (CT177) dsbG Disulfide Bond Chaperone
0933 (CT783) Predicted Disulfide Bond Isomerase
0926 (CT780) Thioredoxin Disulfide Isomerase

Transcription
RNA Degradation
0999 (CT842) pnp Polyribonucleotide Nucleotidyltransferase
0054 (CT297) rnc Ribonuclease III
0119 (CT029) rnhB_1 Ribonuclease HII_1
1068 (CT008) rnhB_2 Ribonuclease HII_2
0934 (CT784) rnpA Ribonuclease P Protein Component
0504 (CT397) vacB Ribonuclease Family


RNA Elongation & Termination Factors
0741 (CT636) greA Transcription Elongation Factor
0316 (CT097) N Utilization Protein A
0076 (CT320) nusG Transcriptional Antitermination
0845 (CT704) pcnB_1 PolyA Polymerase 1
0966 (CT410) pcnB_2 PolyA Polymerase 2
0610 (CT491) Transcription Termination Factor




RNA Methylases
0674 (CT553) fmu RNA Methyltransferase
1059 (CT354) kgsA Dimethyladenosine Transferase
0187 (CT133) Predicted Methylase
0530 (CT403) rRNA Methylase 1
0660 (CT540) spoU_2 rRNA Methylase 2
0117 (CT027) trmD tRNA (Guanine N-1)-Methyltransferase
0885 (CT742) ygcA rRNA Methyltransferase
0986 (CT829) yggH Predicted rRNA Methylase
0987 (CT830) ytgB Predicted rRNA Methylase




RNA Modification
0649 (CT530) fmt Methionyl tRNA Formyltransferase
0910 (CT766) miaA tRNA Pyrophosphate Transferase
0719 (CT658) sfhB Predicted Pseudouridine Synthase
0219 (CT193) tgt Queuine tRNA Ribosyl Transferase
0580 (CT463) truA Pseudouridylate Synthase I
0319 (CT094) truB tRNA Pseudouridine Synthase
0403 (CT106) yceC Predicted Pseudouridine Synthetase Family
0864 (CT723) yjbC Predicted Pseudouridine Synthase




RNA Polymerase & Transcription Regulators
0586 (CT468) atoC Two-Component Regulator
0362 (CT061) rpsD Sigma-28/WhiG Family
0501 (CT394) hrcA HTH Transcriptional Repressor
0793 (CT588) rbsU Sigma Regulatory Family Protein-PP2C Phosphatase (RsbW Antagonist)
0626 (CT507) rpoA RNA Polymerase Alpha
0081 (CT315) rpoB
0082 (CT314) rpoC RNA Polymerase Beta'
0756 (CT615) rpoD RNA Polymerase Sigma-66
0771 (CT609) rpoN RNA Polymerase Sigma-54
0511 (CT424) rsbV_1 Sigma Regulatory Factor_1
0909 (CT765) rsbV_2 Sigma Factor Regulator_2
0670 (CT549) rsbW Sigma Regulatory Factor-Histidine Kinase
0750 (CT630) tctD HTH Transcriptional Regulatory Protein Receiver Domain
1069 (CT009) yfgA HTH Transcriptional Regulator










Translation
Amino Acyl tRNA Synthesis
0892 (CT749) alaS Alanyl tRNA Synthetase
0570 (CT454) argS Arginyl tRNA Transferase
0662 (CT542) aspS Aspartyl tRNA Synthetase
0932 (CT782) cysS Cysteinyl tRNA Synthetase
0003 (CT003) gatA Glu tRNA Gln Amidotransferase (A subunit)
0004 (CT004) gatB Glu tRNA Gln Amidotransferase (B subunit)
0002 (CT002) gatC Glu tRNA Gln Amidotransferase (C subunit)
0560 (CT445) gltX Glutamyl-tRNA Synthetase
0946 (CT796) glyQ Glycyl tRNA Synthetase
0663 (CT543) hisS Histidyl tRNA Synthetase
0109 (CT019) ileS Isoleucyl-tRNA Synthetase
0153 (CT209) leuS Leucyl tRNA Synthetase
0931 (CT781) lysS Lysyl tRNA Synthetase
0132 (CT032) metG Methionyl-tRNA Synthetase
0993 (CT836) pheS Phenylalanyl tRNA Synthetase, Alpha
0594 (CT475) pheT Phenylalanyl tRNA Synthetase Beta
0500 (CT393) proS Prolyl tRNA Synthetase
0870 (CT729) serS Seryl tRNA Synthetase_2
0806 (CT581) thrS Threonyl tRNA Synthetase
0802 (CT585) trpS Tryptophanyl tRNA Synthetase
0361 (CT062) tyrS Tyrosyl tRNA Synthetase
0094 (CT302) valS Valyl tRNA Synthetase



Peptide Chain Initiation, Elongation & Termination
1067 (CT353) def Polypeptide Deformylase
0184 (CT122) efp_1 Elongation Factor P_1
0895 (CT752) efp_2 Elongation Factor P_2
0550 (CT437) fusA Elongation Factor G
0073 (CT323) infA Initiation Factor IF-1
0317 (CT096) infB Initiation Factor-2
0990 (CT833) infC Initiation Factor 3
0113 (CT023) prfA Peptide Chain Releasing Factor
0576 (CT459) prfB Peptide Chain Release Factor 2
0950 (CT800) pth Peptidyl tRNA Hydrolase
0318 (CT095) rbfA Ribosome Binding Factor A
0699 (CT677) rrf Ribosome Releasing Factor
0697 (CT679) tsf Elongation Factor TS
0074 (CT322) tufA Elongation Factor Tu


Ribosomal Proteins 






0078 (CT318) rl1 L1 Ribosomal Protein
0644 (CT525) rl2 L2 Ribosomal Protein
0647 (CT528) rl3 L3 Ribosomal Protein
0646 (CT527) rl4 L4 Ribosomal Protein
0635 (CT516) rl5 L5 Ribosomal Protein
0633 (CT514) rl6 L6 Ribosomal Protein
0080 (CT316) rl7 L7/L12 Ribosomal Protein
0953 (CT803) rl9 L9 Ribosomal Protein
0079 (CT317) rl10 L10 Ribosomal Protein
0077 (CT319) rl11 L11 Ribosomal Protein
0247 (CT125) rl13 L13 Ribosomal Protein
0637 (CT518) rl14 L14 Ribosomal Protein
0630 (CT511) rl15 L15 Ribosomal Protein
0640 (CT521) rl16 L16 Ribosomal Protein
0625 (CT506) rl17 L17 Ribosomal Protein
0632 (CT513) rl18 L18 Ribosomal Protein
0118 (CT028) rl19 L19 Ribosomal Protein
0992 (CT835) rl20 L20 Ribosomal Protein
0546 (CT420) rl21 L21 Ribosomal Protein
0642 (CT523) rl22 L22 Ribosomal Protein
0645 (CT526) rl23 L23 Ribosomal Protein
0636 (CT517) rl24 L24 Ribosomal Protein
0545 (CT419) rl27 L27 Ribosomal Protein
0327 (CT086) rl28 L28 Ribosomal Protein
0639 (CT520) rl29 L29 Ribosomal Protein
0112 (CT022) rl31 L31 Ribosomal Protein
0961 (CT810) rl32 L32 Ribosomal Protein
0250 (CT150) rl33 L33 Ribosomal Protein
0935 (CT785) rl34 L34 Ribosomal Protein
0991 (CT834) rl35 L35 Ribosomal Protein
0936 (CT786) rl36 L36 Ribosomal Protein


0315 (CT098) rs1 S1 Ribosomal Protein
0696 (CT680) rs2 S2 Ribosomal Protein
0641 (CT522) rs3 S3 Ribosomal Protein
0733 (CT626) rs4 S4 Ribosomal Protein
0631 (CT512) rs5 S5 Ribosomal Protein
0951 (CT801) rs6 S6 Ribosomal Protein
0551 (CT438) rs7 S7 Ribosomal Protein
0634 (CT515) rs8 S8 Ribosomal Protein
0246 (CT126) rs9 S9 Ribosomal Protein
0549 (CT436) rs10 S10 Ribosomal Protein
0627 (CT508) rs11 S11 Ribosomal Protein
0552 (CT439) rs12 S12 Ribosomal Protein
0628 (CT509) rs13 S13 Ribosomal Protein
0937 (CT787) rs14 S14 Ribosomal Protein
1000 (CT843) rs15 S15 Ribosomal Protein
0116 (CT026) rs16 S16 Ribosomal Protein
0638 (CT519) rs17 S17 Ribosomal Protein
0952 (CT802) rs18 S18 Ribosomal Protein
0643 (CT524) rs19 S19 Ribosomal Protein
0754 (CT617) rs20 S20 Ribosomal Protein
0031 (CT342) rs21 S21 Ribosomal Protein








Other Categories 


Chlamydia-Specific Proteins 




0561 (CT446) Euo CHLPS Euo Protein
0804 (CT583) Gp6D CHLTR Plasmid Paralog
0186 (CT119) Similarity to IncA_1
0291 (CT232) incB Inclusion Membrane Protein B
0292 (CT233) incC Inclusion Membrane Protein C
1026 (CT377) LtuA Protein
0333 (CT080) LtuB Protein
0005 (CT871) pmp_1 Polymorphic Outer Membrane Protein G Family
0013 (CT871) Polymorphic Outer Membrane Protein G Family
0014 (CT871) pmp_3 Polymorphic Outer Membrane Protein G Family
0015 (CT871) pmp_3 PMP_3 (Frame-shift with 0014)
0016 (CT874) pmp_4 Polymorphic Outer Membrane Protein G Family
0017 (CT871) pmp_4 PMP_4 (Frame-shift with 0016)
0018 (CT874) Polymorphic Outer Membrane Protein G Family
0019 (CT871) PMP_5 (Frame-shift with 0018)
0444 (CT871) pmp_6 Polymorphic Outer Membrane Protein G/I Family
0445 (CT871) pmp_7 Polymorphic Outer Membrane Protein G Family
0446 (CT871) pmp_8 Polymorphic Outer Membrane Protein G Family
0447 (CT871) pmp_9 Polymorphic Outer Membrane Protein G/I Family
0450 (CT871) pmp_10 Polymorphic Outer Membrane Protein G Family
0449 (CT871) pmp_10 PMP_10 (Frame-shift with 0450)



0451 (CT871) pmp_11 Polymorphic Outer Membrane Protein G Family
0452 (CT874) pmp_12 Polymorphic Outer Membrane Protein (truncated) A/I Family
0453 (CT871) pmp_13 Polymorphic Outer Membrane Protein G Family
0454 (CT872) pmp_14 Polymorphic Outer Membrane Protein H Family
0466 (CT869) pmp_15 Polymorphic Outer Membrane Protein E Family
0467 (CT869) pmp_16 Polymorphic Outer Membrane Protein E Family
0468 (CT869) pmp_17 Polymorphic Outer Membrane Protein E Family
0469 (CT869) pmp_17 PMP_17 (Frame-shift with 0468)
0470 (CT869) pmp_17 PMP_17 (Frame-shift with 0469)
0471 (CT870) pmp_18 Polymorphic Outer Membrane Protein E/F Family
0539 (CT412) pmp_19 Polymorphic Membrane Protein A Family
0540 (CT413) pmp_20 Polymorphic Membrane Protein B Family
0963 (CT812) pmp_21 Polymorphic Membrane Protein D Family


0562 CHLPS 43 kDa Protein Homolog_1
0927 CHLPS 43 kDa Protein Homolog_2
0928 CHLPS 43 kDa Protein Homolog_3
0929 CHLPS 43 kDa Protein Homolog_4
0728 (CT622) CHLPN 76 kDa Homolog_1 (CT622)
0729 (CT623) CHLPN 76 kDa Homolog_2 (CT623)
0133 (CT109) CHLPS Hypothetical Protein
0332 (CT081) CHLTR T2 Protein



Miscellaneous Enzymes/Conserved Proteins 





0193 argR Possible Arginine Repressor
1046 Aromatic Amino Acid Hydroxylase
0232 Similarity to 5'-Methylthioadenosine Nucleosidase




0128 (CT035) Biotin Protein Ligase




0513 (CT426) Fe-S Oxidoreductase_1
0911 (CT767) Fe-S Oxidoreductase_2
0373 (CT057) gcpE GcpE Protein
0407 (CT103) HAD Superfamily Hydrolase/Phosphatase
0917 (CT771) Hydrolase/Phosphatase Homolog
0488 (CT385) ycfF HIT Family Hydrolase
0701 (CT675) karG Arginine Kinase
0526 (CT399) kpsF GutQ/KpsF Family Sugar-P Isomerase
0919 (CT773) ldh Leucine Dehydrogenase
0022 (CT349) maf Maf protein
0997 (CT840) mesJ PP-loop superfamily ATPase
0151 (CT148) mhpA Monooxygenase
0730 (CT624) mviN Integral Membrane Protein
0861 (CT720) NifU-Related Protein
0479 (CT380) phnP Metal Dependent Hydrolase
0106 (CT015) phoH ATPase
0329 (CT084) Phospholipase D Superfamily
0435 (CT284) Phospholipase D Superfamily
0581 (CT464) Phosphoglycolate Phosphatase
0897 (CT754) Predicted Phosphohydrolase
0509 (CT422) Predicted Metalloenzyme
1030 (CT375) Predicted D-Amino Acid Dehydrogenase
0531 (CT404) SAM Dependent Methyltransferase
0337 (CT076) smpB Small Protein B
0394 (CT256) tlyC_1 CBS Domain Protein (Hemolysin Homolog)_1
0510 (CT423) tlyC_2 CBS Domains (Hemolysin Homolog)_2
0382 (CT048) yabC SAM-Dependent Methyltransferase
0787 (CT594) yabD PHP Superfamily (Urease/Pyrimidinase) Hydrolase
0611 (CT492) yacE Predicted Phosphatase/Kinase
0579 (CT462) yacM Sugar Nucleotide Phosphorylase
0578 (CT461) yaci Phosphohydrolase



0345 (CT071) yacM CT071 Hypothetical Protein
0566 (CT450) yacS YacS family Hypothetical Protein
0591 (CT472) yagE YagE family
0039 (CT335) ybaB YbaB family Hypothetical Protein
0101 (CT012) ybbP YbbP family Hypothetical Protein
0915 (CT769) ybeB iojap Superfamily Ortholog
0137 (CT108) ybgI ACR family
0529 (CT402) ycaH ATPase
0438 (CT287) ycbF PP-loop Superfamily ATPase
0734 (CT627) yccA YccA Hypothetical Protein
0954 (CT804) ychB Predicted Kinase
0261 (CT217) ydaO PP-Loop Superfamily ATPase
0245 (CT127) ydhO Polysaccharide Hydrolase-Invasin Repeat Family
0573 (CT457) yebC YebC Family Hypothetical Protein
0689 (CT687) yfhO_1 NifS-related Aminotransferase_1
0862 (CT721) yfhO_2 NifS-related Aminotransferase_2
0547 (CT434) ygbB YgbB Family Hypothetical Protein
0237 (CT184) yggF YggF Family Hypothetical Protein
0775 (CT606) yggV YggV Family Hypothetical Protein
0396 (CT258) yfhO_3 NifS-related Aminotransferase_3
0605 (CT487) yhhF Predicted Methylase
0575 (CT458) yhhY Amino Group Acetyl Transferase
0592 (CT473) yidD YidD Family
0982 (CT825) yigN YigN Family Hypothetical Protein
0657 (CT537) yjeE YjeE Hypothetical Protein
0768 (CT644) yohI YohI Predicted Oxidoreductase
0336 (CT077) yojL YojL Hypothetical Protein
0217 (CT140) ypdP YpdP Hypothetical Protein
0140 (CT212) yqdE YqdE Hypothetical Protein
0263 (CT221) yqfU YqfU Hypothetical Protein
0139 (CT211) yqgE YqgE Hypothetical Protein
0270 (CT137) ywlC Sua5 Superfamily-related Protein
0879 (CT738) yycJ Metal Dependent Hydrolase



Homologs to CHLTR Hypothetical Coding Genes 






0001 (CT001) CT001 Hypothetical Protein
0020 (CT351) CT351 Hypothetical Protein
0021 (CT350) CT350 Hypothetical Protein
0026 (CT345) CT345 Hypothetical Protein
0035 (CT339) CT339 Hypothetical Protein
0036 (CT338) CT338 Hypothetical Protein
0055 (CT296) CT296 Hypothetical Protein
0062 (CT289) CT289 Hypothetical Protein
0065 (CT288) CT288 Hypothetical Protein
0068 (CT360) CT360 Hypothetical Protein
0071 (CT325) CT325 Hypothetical Protein
0072 (CT324) CT324 Hypothetical Protein
0085 (CT311) CT311 Hypothetical Protein
0087 (CT309) CT309 Hypothetical Protein
0093 (CT303) CT303 Hypothetical Protein
0100 (CT011) CT011 Hypothetical Protein
0104 (CT017) CT017 Hypothetical Protein
0105 (CT016) CT016 Hypothetical Protein
0107 (CT058) CT058 Hypothetical Protein_1
0108 (CT018) CT018 Similarity
0111 (CT021) CT021 Hypothetical Protein
0121 (CT031) CT031 Hypothetical Protein
0129 (CT036) CT036 Similarity
0145 (CT114) CT114 Hypothetical Protein
0150 (CT147) CT147 Hypothetical Protein
0152 (CT149) CT149 Hypothetical Protein
0176 (CT153) CT153 Hypothetical Protein
0188 (CT132) CT132 Hypothetical Protein
0189 (CT131) CT131 Hypothetical Protein
0206 (CT203) CT203 Hypothetical Protein
0229 (CT178) CT178 Hypothetical Protein
0230 (CT179) CT179 Hypothetical Protein
0234 (CT181) CT181 Hypothetical Protein
0249 (CT151) CT151 Hypothetical Protein
0253 (CT144) CT144 Hypothetical Protein_1
0254 (CT143) CT143 Hypothetical Protein_1
0255 (CT142) CT142 Hypothetical Protein_1
0256 (CT144) CT144 Hypothetical Protein_2
0257 (CT143) CT143 Hypothetical Protein_2
0259 (CT142) CT142 Hypothetical Protein_2
0276 (CT191) CT191 Hypothetical Protein
0288 (CT195) CT195 Hypothetical Protein
0293 (CT234) CT234 Hypothetical Protein
0301 (CT242) CT368 Hypothetical Protein
0303 (CT244) CT244 Hypothetical Protein
0308 (CT249) CT249 Similarity
0312 (CT101) CT101 Hypothetical Protein
0328 (CT085) CT085 Hypothetical Protein
0330 (CT083) CT083 Hypothetical Protein
0331 (CT082) CT082 Hypothetical Protein
0334 (CT079) CT079 Similarity
0342 (CT073) CT073 Hypothetical Protein
0343 (CT073) (frame-shift with 0342?)
0350 (CT066) CT066 Hypothetical Protein
0369 (CT058) CT058 Hypothetical Protein_2
0370 (CT058) CT058 Hypothetical Protein_3
0374 (CT056) CT056 Hypothetical Protein
0379 (CT053) CT053 Hypothetical Protein
0381 (CT326) CT326 Similarity
0383 (CT047) CT047 Hypothetical Protein
0387 (CT043) CT043 Hypothetical Protein
0389 (CT041) CT041 Hypothetical Protein
0393 (CT038) CT038 Hypothetical Protein
0395 (CT257) CT257 Hypothetical Protein
0399 (CT253) CT253 Hypothetical Protein
0400 (CT254) CT254 Hypothetical Protein
0401 (CT255) CT255 Hypothetical Protein
0405 (CT105) CT105 Hypothetical Protein
0408 (CT102) CT102 Hypothetical Protein
0409 (CT260) CT260 Hypothetical Protein
0411 (CT262) CT262 Hypothetical Protein
0412 (CT263) CT263 Hypothetical Protein
0415 (CT266) CT266 Hypothetical Protein
0420 (CT271) CT271 Hypothetical Protein
0422 (CT273) CT273 Hypothetical Protein
0423 (CT274) CT274 Hypothetical Protein
0425 (CT276) CT276 Hypothetical Proteins
0426 (CT277) CT277 Similarity
0434 (CT283) CT283 Hypothetical Protein
0441 (CT007) CT007 Hypothetical Protein
0442 (CT006) CT006 Hypothetical Protein
0443 (CT005) CT005 Hypothetical Protein
0474 (CT365) CT365 Hypothetical Protein
0476 (CT865) CT865 Hypothetical Protein
0480 (CT383) CT383 Hypothetical Protein
0485 (CT382) CT382.1 Hypothetical Protein
0487 (CT384) CT384 Hypothetical Protein
0489 (CT386) CT386 Hypothetical Protein
0490 (CT387) CT387 Hypothetical Protein
0491 (CT389) CT389 Hypothetical Protein
0496 (CT391) CT391 Hypothetical Protein
0497 (CT388) CT388 Hypothetical Protein
0506 (CT421) CT421 Hypothetical Protein
0507 (CT421) CT421.1 Hypothetical Protein
0508 (CT421) CT421.2 Hypothetical Protein
0512 (CT425) CT425 Hypothetical Protein
0514 (CT427) CT427 Hypothetical Protein
0518 (CT429) CT429 Hypothetical Protein
0522 (CT433) CT433 Hypothetical Protein
0525 (CT398) CT398 Hypothetical Protein
0533 (CT406) CT406 Hypothetical Protein
0537 (CT814) CT814.1 Hypothetical Protein
0538 (CT814) CT814 Hypothetical Protein
0554 (CT440) CT440 Hypothetical Protein
0559 (CT441) CT441.1 Hypothetical Protein
0565 (CT449) CT449 Hypothetical Protein
0572 (CT456) CT456 Hypothetical Protein
0582 (CT465) CT465 Hypothetical Protein
0583 (CT466) CT466 Hypothetical Protein
0588 (CT469) CT469 Hypothetical Protein
0589 (CT470) CT470 Hypothetical Protein
0590 (CT471) CT471 Hypothetical Protein
0593 (CT474) CT474 Hypothetical Protein
0595 (CT476) CT476 Hypothetical Protein
0601 (CT483) CT483 Hypothetical Protein
0602 (CT484) CT484 Hypothetical Protein
0606 (CT488) CT488 Hypothetical Protein
0609 (CT490) CT490 Hypothetical Protein
0622 (CT503) CT503 Hypothetical Protein
0623 (CT504) CT504 Hypothetical Protein
0648 (CT529) CT529 Hypothetical Protein
0658 (CT538) CT538 Hypothetical Protein
0667 (CT546) CT546 Hypothetical Protein
0668 (CT547) CT547 Hypothetical Protein
0669 (CT548) CT548 Hypothetical Protein
0671 (CT550) CT550 Hypothetical Protein
0673 (CT552) CT552 Hypothetical Protein
0675 (CT696) CT696 Hypothetical Protein
0676 (CT695) CT695 Similarity
0681 (CT691) CT691 Hypothetical Protein
0687 (CT482) CT482 Hypothetical Protein
0688 (CT481) CT481 Hypothetical Protein
0700 (CT676) CT676 Hypothetical Protein
0705 (CT671) CT671 Hypothetical Protein
0706 (CT670) CT670 Hypothetical Protein
0708 (CT668) CT668 Hypothetical Protein
0709 (CT667) CT667 Hypothetical Protein
0710 (CT666) CT666 Hypothetical Protein
0711 (CT665) CT665 Hypothetical Protein
0713 (CT663) CT663 Hypothetical Protein
0717 (CT656) CT656 Hypothetical Protein
0718 (CT657) CT657 Hypothetical Protein
0720 (CT659) CT659 Hypothetical Protein
0722 (CT654) CT654 Hypothetical Protein
0725 (CT652) CT652.1 Hypothetical Protein
0726 (CT620) CT620 Hypothetical Protein
0727 (CT619) CT619 Hypothetical Protein
0739 (CT638) CT638 Hypothetical Protein
0742 (CT635) CT635 Hypothetical Protein
0746 (CT632) CT632 Hypothetical Protein
0747 (CT631) CT631 Hypothetical Protein
0751 (CT651) CT651 Hypothetical Protein
0755 (CT616) CT616 Hypothetical Protein
0760 (CT611) CT611 Hypothetical Protein
0761 (CT610) CT610 Hypothetical Protein
0764 (CT648) CT648 Hypothetical Protein
0765 (CT647) CT647 Hypothetical Protein
0766 (CT646) CT646 Hypothetical Protein
0767 (CT645) CT645 Hypothetical Protein
0770 (CT642) CT642 Hypothetical Protein
0774 (CT606) CT606.1 Hypothetical Protein
0776 (CT605) CT605 Hypothetical Protein
0779 (CT602) CT602 Hypothetical Protein
0783 (CT598) CT598 Hypothetical Protein
0791 (CT590) CT590 Hypothetical Protein
0792 (CT589) CT589 Hypothetical Protein
0803 (CT584) CT584 Hypothetical Protein
0807 (CT580) CT580 Hypothetical Protein
0808 (CT579) CT579 Hypothetical Protein
0809 (CT578) CT578 Hypothetical Protein
0810 (CT577) CT577 Hypothetical Protein
0814 (CT573) CT573 Hypothetical Protein
0818 (CT569) CT569 Hypothetical Protein
0819 (CT568) CT568 Hypothetical Protein
0820 (CT567) CT567 Hypothetical Protein
0821 (CT566) CT566 Hypothetical Protein
0822 (CT565) CT565 Hypothetical Protein
0827 (CT560) CT560 Hypothetical Protein
0834 (CT556) CT556 Hypothetical Protein
0840 (CT700) CT700 Hypothetical Protein
0842 (CT702) CT702 Hypothetical Protein
0843 (CT702) CT702 Hypothetical Protein
0852 (CT711) CT711 Hypothetical Protein
0853 (CT712) CT712 Hypothetical Protein
0857 (CT716) CT716 Hypothetical Protein
0859 (CT718) CT718 Hypothetical Protein
0865 (CT724) CT724 Hypothetical Protein
0869 (CT728) CT728 Hypothetical Protein
0874 (CT733) CT733 Hypothetical Protein
0875 (CT734) CT734 Hypothetical Protein
0884 (CT741) CT741 Hypothetical Protein
0887 (CT744) CHLTR Possible Phosphoprotein
0896 (CT753) CT753 Hypothetical Protein
0906 (CT763) CT763 Hypothetical Protein
0908 (CT764) CT764 Hypothetical Protein
0912 (CT768) CT768 Hypothetical Protein
0925 (CT779) CT779 Hypothetical Protein
0938 (CT788) CT788 Hypothetical Protein
0939 (CT790) CT790 Hypothetical Protein
0943 (CT794) CT794.1 Hypothetical Protein
0945 (CT795) CT795 Hypothetical Protein
0956 (CT805) CT805 Hypothetical Protein
0960 (CT809) CT809 Hypothetical Protein
0989 (CT832) CT832 Hypothetical Protein
0994 (CT837) CT837 Hypothetical Protein
0995 (CT838) CT838 Hypothetical Protein
0996 (CT839) CT839 Hypothetical Protein
1002 (CT845) CT845 Hypothetical Protein
1003 (CT846) CT846 Hypothetical Protein
1004 (CT847) CT847 Hypothetical Protein
1005 (CT848) CT848 Hypothetical Protein
1006 (CT849) CT849 Hypothetical Protein
1007 (CT849) CT849.1 Hypothetical Protein
1008 (CT850) CT850 Hypothetical Protein
1010 (CT852) CT852 Hypothetical Protein
1011 (CT853) CT853 Hypothetical Protein
1015 (CT857) CT857 Hypothetical Protein
1016 (CT858) CT858 Hypothetical Protein
1019 (CT860) CT860 Hypothetical Protein
1020 (CT861)
1022 (CT863) CT863 Hypothetical Protein
1032 (CT373) CT373 Hypothetical Protein
1033 (CT372) CT372 Hypothetical Protein
1034 (CT371) CT371 Hypothetical Protein
1057 (CT356) CT356 Hypothetical Protein
1058 (CT355) CT355 Hypothetical Protein
1061 (CT330) CT330 Hypothetical Protein
1073 (CT371) CT371 Hypothetical Protein



Coding Genes Not in C. trachomatis 
0486 Hypothetical Proline Permease 

0279 Possible ABC Transporter Permease Protein 

0505 3-Methyladenine DNA Glycosylase 

0193 argR Similarity to Arginine Repressor 

1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase 

1044 bioB Biotin Synthase 

1042 bioD Dethiobiotin synthetase 
0585 Similarity to Cps IncA_2 

0562 CHLPS 43 kDa Protein Homolog_1 

0927 CHLPS 43 kDa Protein Homolog_2 

0928 CHLPS 43 kDa Protein Homolog_3 

0929 CHLPS 43 kDa Protein Homolog_4 

1045 Conserved Hypothetical Membrane Protein 
0251 Conserved Hypothetical Protein 

0278 Conserved Outer Membrane Lipoprotein Protein 

0907 CutA-like Periplasmic Divalent Cation Tolerance Protein 

0171 guaA GMP Synthase 

0172 guaB Inosine 5'-Monophosphate Dehydrogenase 
0608 Uridine 5'-Monophosphate Synthase 
0735 Uridine Kinase 




0980 Similar to Saccharomyces cerevisiae 52.9 kDa Protein 

0232 Similarity to 5'-Methylthioadenosine Nucleosidase 

1046 Tryptophan Hydroxylase 

0477 yqeV_Bs Conserved Hypothetical Protein 

0048 yqfF_Bs Conserved Hypothetical IM Protein 

0587 yvyD_Bs Conserved Hypothetical Protein 

0143 ytjG_Bs_1 Conserved Hypothetical Protein 

0448 y^iG_Bs_2 Conserved Hypothetical Protein 



0006 0180 0440 0977
0007 0181 0455 0978
0008 0190 0456 1018
0009 0203 0457 1023
0010 0204 0458 1027
0011 0205 0459 1029
0012 0209 0460 1040
0028 0210 0461 1051
0029 0211 0462 1052
0034 0212 0463 1053
0041 0213 0464 1054
0042 0214 0465 1055
0043 0215 0472 1056
0044 0216 0473 1064
0045 0218 0481 1065
0046 0220 0483 1066
0047 0221 0492 1070
0049 0222 0493 1071
0050 0223 0494 1072
0051 0224 0498
0063 0225 0499
0064 0226 0516
0066 0233 0517
0067 0240 0523
0069 0241 0524
0070 0242 0553
0099 0243 0574
0124 0266 0600
0125 0267 0656
0126 0268 0664
0130 0277 0677
0131 0283 0678
0132 0284 0685
0142 0285 0686
0146 0287 0724
0147 0352 0731
0155 0353 0745
0156 0354 0753
0157 0355 0794
0158 0356 0795
0159 0357 0796
0162 0358 0797
0163 0365 0798
0164 0366 0799
0165 0367 0829
0166 0368 0830
0167 0371 0831
0168 0372 0881
0169 0375 0882
0170 0376 0913
0173 0391 0914
0174 0398 0930
0175 0404 0944
0177 0431 0964
0178 0432 0975
0179 0439 0976



Chlamydia pneumoniae genome encoded proteins 

cpn_oooi no 4 

CT001 Hypothetical Protein 

KRLKDE I mSLn RKAMLCK I r RCt^SL r/: LCAIifl/GL IG :THNKLN I r AKLCCCVSTP 
ATQ tTY 1 1 re I ACT/ rCLlSFCPFC3KK3RHSHCD3C3SCCCHSHHSDKf 1 

If . • Arufi-.' - r. • • \ - •■' . u: t . 

'■•■';:.!-*;!.Lf-Ki:: :^:«^^^A.:A./.^ . :v^T.;:^:Av:':':-v!'iy,:./. : :•^c::t:v:r:.l■ 

MHWWEOLREDSVTSDFNREEFLRNVPESU^CL'yKVPAVIK 

CPn_0003 889 2370 

gatA-Glu tRNA Gln Amidotransferase 

K IMYRYSALELAKAVTLCELTATGVTOHFFHRI EXAEGQVGAF ISLCKEOALEOAEXr DK 

KRSRCEPU:KlACVPVGIKIWIH\m3UrTTCASRVLENY0PPFOATVVERIKKOT 

KLNMDEFAMGSTTLYSAFHPTHNPWDI^RVPGGSSGGSAAAVSARFCPVALCSITrcGSIR 

OPAAFCCVVGFKPSYCA\^RyCLVAFASSU»IGPLANTVn>VAl^WDVrSGRDPKnATS 

ftEFFROSFMSKLSTEVPKVlCVPRTFLECLilDDIRENFFSSLAIFBGEXmiLVDVELDIL 

SHAVSIYYIU^AEAATNIJViirOCVRYCVTlSPOAHTISOLYDLSRGBGFCKEVMRRIL^ 

trrVLSAEBOPA/VTrKKATAVTWCIVKAFRTAFEKCEILAKPVCSSPAFErCEI 

OD lYTVAMNLAYLPAI AVPSGFSKECLPUILOI IGOQGODQQVCOVCVSFOEHAOIKOLF 

SKRYAKSWLQCQS 

CPn_0004 2334 3833 

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit) 
EICQKCCSRRSIMSAVYAWZS^aCLEWVEl^ASKLFSSAI/^R7CDEPNTNISTVCre 
LPCSLPVLNOSAVEKAVLFGCAVECEISLX^RFDRXSYFYPDS PRNFOITOFEHPI IRGG 
RI KAIVOGESKYFEIAOTH lEDDAGMUOiFGEFACVDYNRAGVPLIE WSKPCMTCPEDA 
VAYATSLVSLLOYIGI SDCNMEBCS IRFD\^SVRPKGSPELRNKVEIKNMNSFAFMAOA 
L£AE3C0R0IDEYLWPrnCDPKLVIPAATYRWDPEKKKTVUlRIJ^ 
LOLTESYrERrRm.PELPyDKYHRYIO£YCLSEDIASILISDKNIATFFEVACKDCKNF 
RSLSNW\m^EFOGRCncrLCVKLPSSGIFPEX:VAOLVNAIDOCVITCKIAKEIADI>0«^ 
GKNPEEILKEKPEIXPKSDECEtflKI lAEVVIJ^ES IVCYK^G^C^KALGFLV^ 
AGKAPPKRVNELLUXLDKG 

CPn_0005 4097 6992 

pmp_1-Polymorphic Outer Membrane Protein 

SDIHFDLCTKMRFSLCGFPLVFSITIXSVFUrSLSATTISLTPEDSFHGDSONAERSYKV 

OAGCWSLTGDVSISNVDNSALNKACFNVTSGSVTFAGrmHGLYFNNISSGT^^ 

CODPQATARFSGFSTLS^IOSPGDIKEOGCLySKNAI>aX^^JYWRFEONOSKTKGGAIS 

CANVrrVGNYDSVSFYONAATrGGAIHSSGPLOIAVNOAEIRFAOrn'AKNGSGGALYSDG 

SIOIDONAYVl^ENEALTTAIGKCGAVCCLPTSGSSTPVPrvrFSDNKOLVFERNHSIM 

GGGAlYAm^ISSGGPTLFrNNISYANSONLGGAIAIDTGGErSLSAEKCTITFQGNRT 

SLPFUWIHLLQNAKFUOOARNGYSIEFYDPITSEAIXJSTOLNIfCDPKNKEYTCTrLF 

SGEKSLANDPRDnCSTIPONVNI^ACYLVIKEGA£VIVSKrrQSPGSHLVU>LCrr^ 

KEDIAITGLAIDIDSLSSSffTAAVIKAMTANKOISVTDS lEL I SPTGNAYEDLRMRNSQT 

FPL^^l^PGAGGSVTVTACDFLPVSPHYGFQGNWKIJW^GTG^nCVG^^ 

SGGYVLSVNrrcrTPKHrrSMAFSOtrSRDKDYAVSNNEyRKiru:SYLYQYTrSL^ 
ASRNPNVWGILSIUirLONPLMIFHFLCAYGHATrrowarDYANFPKWN 
OGSMPIiVFETCRIJOGMPFMKLOLVyAYCJGDFKETTADGRRrSNCSLTSISVPLarRF 
EKLALSODVLYDFSFSYI PDIFRKDPSCEAALVI SGDSWLVPAAKVSRHArVGSGTCRYH 
FWDnfTELLCRGSIECRPHARNYNINCCSKFRF 

CPn_0006 7299 7141 

No robust homolog present in GenBank/EMBL as of 11/7/98 
K0LQEPIJlSAU-£RLSE>^LVlXGVPSPETTRSTPEKaW4QLPKDSR^ 

CPn_0007 7488 10496 

No robust homolog present in GenBank/EMBL as of 11/7/98 
KSFRYNLSLIFSFLWIPLTDSTTSSLSTSLLDEGNPOSHRKLRILAIVLIALS r ILIAC 
GVV^.TVAIPCt^S^^SSPAC>KaCALGCVMLALS^DVL^QCR£^^IVL ^ i ' lTPGI\:# S 
PRSGIS ISGADSTIRSLPTYLLDE)CHPOSMRKLRILAIVLIVFS r IHASGWLLTVAI P 
GLSS\aS5?ACMGACAIX:CVMIJ^IDVUJaCREVPIVLASV m ' t^TC SPRSGISISGA 
DSTIRSLPTYPLDBGHPOSMRKLRILAIVLIVFSI rLIASGWLLTVAI PGLSSI ISSPA 
EMGACALGCV'MLALGIDVLLKKREVP IWPAPI PEEWIODI DEESIRLQOEAEAALARL 
PEEMSAFECYIKVVESHLENMKSLPYKKGLEEKTKHOIRVVRSSLKAMVPEFLDIRRIF 
EEEEFTrX^ARKRLIDLATTLVERKI LTEOLERNNLRKAFSYLYODSIFKKI I DNFEKLA 
WJCFMI LSKS ICRFTII FENHEHCVAKSLLHKNAVtXEKVIYRSLOKSYRDIGKSSAKMK I 
LHGNPFFSLEDTOCKTIMKEHAEMLESL^SYRKWLAI^DENVVtyrPSDPKWraLSGIPCR 
OAl^EISRDEC^^KKAHLKHOESLYTOARDRLTDOSSKENOKELEKAEOEYrSSWERVKK 
FSIERVOERIRAIOKLYPNtLEREEETTGOETVTPTVOGTTASSDLTDrLCRrEVSSBED 
NOWESCVKVLRSHEVTMSWEVKOEYGPKKKEFODQMGSLERFrTEHIEELEVLOKDYSK 
HLSYFKKVWNKKEVQYAXFRLia'LESDLECILAOTESAESIXTOEELPILATRGALEKAV 
FKGSLCCALASKAKPYFEEDPRFODSDTOLRALTLRLOEAKASLEEEIKRFSNLENDIAE 
ERRLUCEGKOTFERAGU^VUlEIAVESTYOUlSLTNTWECTPESEKVYFSKYLtrfYf^ 
RRAKTRLVEMT0RYRDFKMALEAK0FNEEALL0EEL3IOAPSE 

CPn_0008 10780 11685 

No robust homolog present in GenBank/EMBL as of 11/7/98 

' :KYSYLLNY PPPPRRSLCVSCSKLR2LS ITLLVLGVLLLTLC r PGLT.^C I3FCACLGFSA 

LGCr/LVISCLLFLLVRREVPTVRSEE I PRCVSVTPSEEPALEKAOKEPETKK I LDRLPKE 

LOOLDTY lOEVFACLERLKDPKYEDRCLLTEAKEKLRVFDWEKDMMSEFLO rORVXWEE 

AYYVEHCODrLENIAYEIFSSOELRDYYCACVCCYLPSGDARADRLKRSVKEVMDRFMRV 

TVk.WEA.SVMLDHSYGVARELFKKAVGVLEESVYKI LFKSYRDAFYECEKAK IQRDGRFK 

WL. 

rPn.OOU'J tU89 IHV) 

No robust homolog present in GenBank/EMBL as of 11/7/98 
t/r;;AlIAFX)IU-RDItrx>/E0LK0TtFWVGFJIDCTOrETVRK;:CMWLDRVADKFtLREKEEK 
MERHELn (A1wnKA:X:i rAYAKAK/\AFEK£R:;Nt:NOR KVKDVEKWLSKv' L^EFRNOESRR 
AR ERLh liU'^rLY PEV.WKERVLERORTKKVNLENLV AD I EKK YHI ICVR EOEH'WK EVENK 
rVVUYREfIt ItKVt J ;aEEV: ;KCLCU<LE[>:Lfc.TW.':KK LTKAEEUVKEMK FDATEKLGNKVLSO 
•/rNRLE I urFJ)AEEM tFR t EE t EMTLRMVELrLLFMK^^•FEKA.*>U?YN^;CKEMLAKV£PO 
ESPTYR: :cRI<r .ER t^inu>-rA'rrrK:OEftl<K:F:;DLEGKVR'rCR DI ILREOMKHF EVOC 
i Ur rNElXLWVi :Ar:i.rrOARl.DLVArVPYHEFYLOYirM tKREKVR::OWM,\KTERYREIRO 
Al-vjf Rl n a AKrrr 1 1 J<F.EDm,I J* DnWL(.n OERKHr<yRR r . tl NK t AAA(/jkVKOF 

'.iitjJ'iin \\\?A lUL"". 

No robust homolog present in GenBank/EMBL as of 11/7/98 



CKYr/'LR5YPPPPCHSVr:3 . JKLRVLAtTFLVrrjlLLL: JCAlFLTLa:PCLSAArS 
FCLC ICLSALOCVXM rSCLLCLLVKREI nVRPEE I PECV3LAPSEEPAU3AACKTLACL 
PKELOOLSrrO rOEVB^SLRma3SIWEaRSELNDAJCKEXHVFC^lr^Elm4T?E3J^ 
A0EXarfDLNFLINOCRSU«TAESESLCtFKVSKRLCTLPSC0VRCtCiaOCSAKe?VAMJ( 
SLHCEIHKVAVAF0RNSYAMAEKAFAKALGALEE2VYRSLTCSYRDKFLE5ERAKI PWNG 
H tTVLRDDAKOCCAEKKLCMPRNVCRNLGKOSFC 

CPn_OOlO.; M2«B IS746 



FLKRUIRKCALAJCTTFEKKP^KKNUJAVEEANARRLKWRDW^OOEFO 

YP EVSVS I RENK lOETRSNLEKAYEA I EENYRCCVREQEOYWKEEEKREAEFRERCNKIL 

SPEEL£3SLECrDHGUarfFSEKLMEI*ECHIUCL0KEATAEV«2«IL5DAESRI^IVr^ 

KEMPCRIE£IEKTLRMAELPU.PTXKAFEKACSOYNSC>EMLEKVKPYC!CESUYVrSK£ 

RLVSU)EDLRRAYTECOKRFOGDSCLESEVRACREOLRERIOEFETXU3I.7EKELLCVS 

SRrJOTECOCVSCVKKEAPPCKKTYACYYDEmVRVCSRVWTMSSRLREX^ 

ACLSEEDr\rUCEE£WLYREERKNKEKRLVGTKIVATCORIQEF0PSDIVESSNEI^^ 

OKAAFUrmEDHS 

CPn_0011 15377 16614 

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit) 
FWYSIKTAAPAI LHVS PTPPEETKFVI PKDSKSRAIjC:TLLVV,T3IU.WCCAIVLSGVIS 

glsalivcgu;istislcvvlrvlx;liu.uucreltleoieakoiaetfadelkel^ 

Ostekslekiecsrysoqgflnratqkildlssslssitsefrdlrolfdeekiellsge 

rllefiaanlfkogrdvyujlcnuvdiraymgpnfrttcvakvxekakavwef^ 

ELETFT 

CPn_0012 16596 18212 

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit) 

G IRVmJCNKYCIJJCCKYOEKlJUXERIXYNSVOKSYADRLFSYEXTKM^ 

0KE3CCAEAEKAFLEQ0KILLDYCKS I FWLNEl^EINU^DPWSWCLimmTRKVFOEVDDS 

ERWNHKVLlQKIXDDYEKU.EESSKESTEA^nCKLI^DLVDRLEDA^CW 

VKDUUimXnVDPKOOTEAiOCKVELEASLETFLDS I ESELVOCmXJDIYWKEODVKDL 

ARTgEL££0DI£AKR££AA£0Ul5LN£3UJacSKTMU}RAKWHI£21AEDSITWrS0I^ 

IMCARLXILKEDXTSVLPEIDEIETCI^lXELPLLTTREliTKSYIJCnciCSCTLtJafr 

vrENNIYVOEV3\raWNLCFKUX3IS0RFGKK0DDFANLEEOVALOKKRUlELT0N^ 

GFNF>OCEDnCAAAKDLYrRSTAB0KKNFWPCMElJ^RRYHEEVrflCPLLELKYNCA^ 

AKKKlCSUUJJEKEMKEIKKEErYOKXQQRHADRSRHTTYOKUlIAEEIJ^^ 

CPn_0013 19509 21106 

pmp_2-Polymorphic Outer Membrane Protein 

LRDRlArFryiXYWKESPLREKK\A/KKIPLRFU.ISLVPTLSMSNUXSAATTEELSASNS 

FKJTTSTrSrSSKTSSATIXriTm^FKDSWIENVPKTCETOSTSCFKNIW^^ 

GrSmSNIDATTASGAAIGSEAArncrVTLSCFSALSFLKSPAST\m«ajtaiINVKGN^ 

LLaraCVLIO»IFSTGtXXUrNCAGSLKIA^i^nCSLSFICNSSSTRCCa^IHTl^ 

rrUOTffAPTAAGKCXy^IAIADSCTLSISGOSCOIlFEGOTIGATtnVSHSAIO^^ 

KITAUUAOGHTIYFYOPirVTGSTSVADAUJrNSPmuraJKEVTCTIVFSCEX^^ 

KDEKNRTSKUONVATOCTVVUCCDVVLSA^CFSQDANSKLIKD^^ 

NtEINIDSU^KXIKLSAATAOXDIRIDRPVVIJ^ISDESrrQNGFLNEDHSYDCZt^^ 

AGXDIVISAS5R5IDAVOSPYGYOGKWTINW5TDDKKATVSWAK05FNPTAE0EAPLVPN 

ST»fPOCI7^I^LCFAOLFARDK0YFM^m^^AKTYACSLRLQHnASLYSVVSZU£BC»^ 

EIUJWSKTU>CSFVGOLSYGHTDHRH}CTESLPPPPPTLSTDHTSWXYVW 

AVEKrSCRGFFOEYTPFVKVOAWAROOSFVELGAISRDFSOSHLYNIAIPLCIWLEICRF 

AEQVYHWAHYSPDVOlSNPKCTTTLLSNQGSWKTKGSNLAROAGIVQASCniSt^^ 

LFGNFGFEMRCSSR5YNVDACSKZKF 

CPn_0014 21365 21922 

pmp_3-Polymorphic Outer Membrane Protein 

lOtCSIYFTMKSSFPKFVFSTFAIFPLSHIATETVLDSSASFCXINKNCMrSVRESOEnAO 
rrVLFKCNVTLENI PCTGTAITTCSCFWTKCDLTFTGWMSU^CTVnACWAGA^ 
\Aa3KSTTriGFSSI^FIASPCSSITTGKCAVSCSTGSLSLTKMS\rcSSAKTFQRIHAVLS 
PQKLFH 

CPn_0015 21335 24174 

pmp_3-PMP_3 (Frame-shift with 0014) 

LEFDKNVSUJ■SK^^STDNGGAITAKTLSLTC^MSALFSEIJTS3KKGGAI0TSDALT^T 
GWEVSFSONTSSOSGAAIFTEASVTISNNAKVSFrDtnCVTGASSSTTCWSO^ 
KTSTtyPKVrLTCNOKLLFSNm'STTACGAIY\naCLELASCGLTLFSRNSVhC^ 
rAIEDSGELSLSADSGDrVFLGfnVrSTTPGTNRSS I DLCTSAKMTALRSAACRAIYFYD 
P ITTCSSTTWDVUCVNETPADSALOYTCNI I FTGEKI^ETEAADSKNLTSKLW 
GGTLSLKHCVTLOTOAFTOOADSRLEMDVCTTLEPAOTST INM^VINI SS IDGAI^^ 
TKATSKNLTLSCT ITLLDPTCTFYENHSLRNPOSYD r LELKASGTVTSTAVTPOPIMGEK 
FHYGYCXnvCP IVWCTGASTTATFNWTiarrf I PNPER ICSLVPNSLWMAFIOrSSLHYLM 
ETANECLOCDRAFVCAGLSNFFHKDSTKTRRGFRHLSGGYVICGNLHTCSOKILSAAfCO 
LFCRORDYFVAKN(XrrWCCTL-rrOHNEr/ISLPCKLRPCSLSYVTTEIPVLFSCNLSrr 
HTDNDLKTKYTTYPTVKGSWGNDSFALEFGGRAPICLDESALFEOYMPFMKLOFVyAHOE 
GFKEOGTEAREFGSSRLVNLALP IG I RFDKESOCODATYNLTLCVTVDLVRSNPDCTrTL 
RISCDSWKTFCTNLAROALVXRACNHFCFTJSNFEAFSOFSFELRCSSRNYNVDLCAKYOF 

CPn_0016 24333 2S188 

pmp_4-Polymorphic Outer Membrane Protein 

RSDFALKRCCHHRSSFSLLLI3::3LAFPLLMSVSADAADLTLCSR0SYNGDTSTTEFTPK 
AATSDASGTTY I LDCDVS I iVAGKOTGLTTSCFSWTACNLTFLGNCFSLHFDN I ISSTVA 
CVWSNTAASGrTKFSCFSTLRMLAAPP.TTnKGArK ITDGLVFEi! tCNLOLNENASSENC 
^^I^^rTLSLTCSTRFVAFU:NC3300GGAIYASGD3VrCE^^ACIL3Ft:NNSAT^SOGAI 
3ABCWLVtSNWN t FFDCC)C\TTT rx;A r ry^iKAOANPOP t LTL3CNESLHFLNNTAGNSC 
'TAIYTKKLVLSSGRCCVLFriNNKAANATPKCGA I A I LDSCE I CISAOLGN 1 1 FECWTST 
TOSPArnnHMAI DLA^JNAKFLNLPArnOriKV I FYDP ITSr/yVTDKUILHKADAGaCwrYE 
'if r VFr/;EKLSEEELKKPE3NLK JTrr'^/v/KLA/VyVLVLKUrr.rrVV',Wr fTOVBriSKWKD 

'yrrTFF^v:AKWTLNc:tj\iNrD::L[XTniKA r t kataa::kdval:vp fMr.vnAOfJWYKHH 

rtU'WZ-'Kf LrMUTAOf^TMTTTL H'nr\':tJnTUlVfiVf{jl'Mttta*^:ii'jm -NirKNKKCYLN 

u> 

'.('njMHV .:7iy(l 

t w. tTMr: [ K( rn ; i ivw/nnATAKTKNATi/rvn-K'p'r/Kr'Nrrftv :n.vr+i::t*#:::KVDVRS 
I tj:'.t Mitn'sv: :: :u :::: nil! MV:\ ; i Aiii-i jii:ujti> iHtjur.'fuu: :r,Ai ;vau y » :FfTA:;ENFFN 
KAFr vt.i :vnKi H M .VAKwr mtvv// ;am; :7M i t xU'jyri akh.:j :n: :i .pfvi-tiarfayg 
I m ♦iNMTTK VT! :y 1 -v w *jla n** : i !•/ ? :a ( i v/a: a :m n; wvi »ri rrc FC i^i ( yai lo 
rir/FKWir rrw ;h::iu:iu>i .fnc^wk^ ; t K!.-KKK;:rjK:rrriJi^; iAY\*n/v r rni n^yTTTLM 
vncDcwrrrf.'TT.'rioROAtL'-ii.vy^HHAFASNrEr. ,fd/elrcs3Rsyaiolocrfx;f

CPn_0018 2751 J 2V0O)

pmp_5-Polymorphic Outer Membrane Protein

SYfWICrSVSMLU\LLC3CA33;VIJ<AATTPLNP£DCFIC£i:mTn*FSPKSTrDA^ 

C^TGO^Y 1DPCKCC3 rTCTCr/ETACDLTFtX:NGrn*U:Fl.SVDACANIAVAHV^ 

rr 0FL3 LVITE5 PK3A\mt;KGSLV3LCAVOL0D mLVt.TSNASVEtXX7/IKCNSCL 10 

" r K?i.TA r Fr;o^^TnsKKor;A [.ttto^ltt EJftn>CTUCFNEKKA\rrsocALDi£AA^ 

::r/^::lT;:'-;^f' ^:al;,v\:':!^^ • . !r ::;:iv\rMATrL;y;A rrr:rrv:;'r..;:.r-? 

'X^DrVFECNOVTTTAPNArrKWA^IHLESTAXVrCLAASOCNAiyrYOPiri NtrirSASDN 
LR INEVSANOKLSGS rVFSCERLSTAEAIAENLTSRINOPVTLVBGSLVUCOGVTLITQG 
F30EPESTIXLDLGTSL 

CPn_0019 29007 30356 

pmp_5-PMP_5 (frame-shift with 0018)

ASTE33 mrm^ INAOTrrcKNPINIVASAANKNITLTCTLALVNADGA^^ 

0YSFVKLSPGAGGTIITODAS0KPLEVAPSRPHYGyCX;HWhMJVIPCTOroPSOANL£>W 

RTt;VT.PNPEROGSLVPNSLWGSFVIX)RMOEIMVNSSOIJXQERCVWGAGIANFLHM 

NEHCYRHSCVCYLVCVCTHAFSDATIWVAFCOtTSRDKDYWSKNHCTSYSC^^ 

EFRSPQGFr^^SSSEJW:CNOVVTII»tOI^SHRN^roMKTKrtTYPEAOT 

CATrYYYPNSTFUT>YYSPFlJUOCTYAH0EDFKETGCEVRKFTSGDU^^ 

RFSrcKRCSYELTLAY\^DVIRKDPKSTATIJ^GATWSTKGNNLSROCXOLRLGNHCLIN 

PCIEVFSHGAIEUlGSSRNYNINLCGKyRf 

CPn_0020 32717 30603

Predicted OMP (leader (H) peptide: outer membrane)

r HKNLRI0A^^CVVVO^VC0SUaVAHGNVMVWRAm»VCDYLEV^^ 
RFAMypWFLGGSMrTLTPCTr/IRKCYISTSECPKKDLCI^DYLEVSSDSIXSIGKTTL 
RVCR I PIIJ-LPPFSIMPME I PKPPINFRGCTKKIFUISYtXaiSYSPISRJCHFSSTFTLDSF 
FKHCA?CMCFNUlCS0KaVPEWmihKSYYAHRLAIDMA£yWt3RVTUJ^ 
GEYKI^DSVrerVAOirPNNFMIJCinGPTRVrxriVJNDWFEGY^ 
-VLTUtQYPISrYNnWYLENIVECGYIJffArSDHIVGEJirSSIJUAW 
TI^STl^SSLIVYSDVPE I SSRHSQLSAKLOLDYRFLIilKSYlORRHII EPFVTFITETR 
PLAKNEDHYIFSIOnAFHSLNUJCAGIOTSVLSKTrn»RfPRIHAKLW™ 
FPKTACELSLPFGKKNTVSt^AEWIWKKHCWDHMNIRWEWCNI»A^A^f^IXSIi^RSKVS 
IKCDRENFILDVSRPID0LLDS?LSDHRNLrLCKrJ^PHPCW^m^^LRYGWHRODr^PN 
VLEYQMILCTKrFEHWOLYGVYERREADSRFFTn-KLDKPKKPPF 

CPn_0021 34470 32707 

Predicted OMP (leader (19) peptide)

CSRSPYPNIEIIARGVEHRSMGUT(LTtJXn.LirSLPrSLVAKFPESVCHKILYISTO 

LSOAMETADPLOOIiVI^AVSGHLCKTSDDLLrKAIJ^PYPVIRIXAAYRIAN^ 
DHLHSFIHKI^EEIOCI^AAIFUU.ErPEESDAYIRDIJLAAKKSAIRSATAWIGEYQQra 
FLPTLRNIXTSASPODOEAILYAljCKLKDCOSYYNIKKOLQKPDVOVrr^^ 
EEDAU^IKKOALEERPRALYALWiLPSEIGrprALPIFUaKNSEAKLWArj^^^ 
DTPKLLCT ITERLVOPKYNErTLALSFSKGRTLONWKRVNI IVP0DP0E31ERI.I*STTRCLE 
E0ILTFLFRLPKEAYLPCIYKUJ^QKTOLATTAISFLSHTSH0EAli)LLrOAAKLPGEP 
I IRAYADLAIYin*T}a3PEKKflSLHDYAKKLI0ETU.FVDTe*QRPH PSMPYLRYOVTPES 
RTKLKLOILETIATSKSSEDIRIilQIKTEIinAKNFPVLAGLLIKIVE 

CPn_0022 35042 34395 

maf 

T I LOVISTTCCWSNTRSFYSMSLPLVLGSSSPRRKF ILEKrKVTFTVTPSNFDESKVSYS 
GDP r AYTOEIJ^OKAYAVSELHSPCDCI ILTGDTIVSVTORI FTKPODKADAIOMUCTUl 
^^C7^Hrw^SrAVU^KGKIXTGSETS0ZSLTMIPDHRIESYIDTVCTU^ 
ILKKVHGCVYWQGLPIOTLKYLLEELNrDLWDYSI 

CPn_0023 36657 350U

yjjK/alr-ABC Transporter Protein ATPase 

ENRAKLLYSKOHFVMLSAMSIVUDKIGKSLGTRILFDDVSWFNPGNCYGLTGPNGACKS 
TL LKII MGM I EPTRGS ISLPKKVG ILRQN IDSFHDTrVLarVIMaNTRLWEALORRDNLY 
LQEFTDAIGMEt^EI EEI rCEENGYRADSEAEEIXTG XC I PNE14FDKKMAMI PIDLOFRV 
tXCOALFGHPEAlXIXEPTNHLDLYS INWU^iFlJCDYEGWI WSHDRHFLNTITTH r AD 
IDYOT 1 1 r Y PGNYDDMVEMKTASREOEKADI KSKEKKISOUCEFVAKFGAGSRASOVQSR 
LREIKKU3 POELKKSN rOR PY I RFPLSDKSSGKWLSLEAITKDYCDHOVr HPFSLEIYO 
GDiCLG:iCNNGU:icrrLhKl^GVEAPSSGSIKLGHOAICSYFPO>mSDVIJUX:GOETLF 
cWLRNRKTG INIX:EIRSVU;KMLFGCDDAFK0 rOALSGCCTARUii^Ol^ 
EAmiLDLESVSALSWAINDYKGTArFVSHDRGLIODCATKLLIFDKDKITFFDGTMVDY 
TAGHKOU. 

CPn_0024 37605 36661

xerC-Integrase/recombinase

REVMIASIYSFLDYLKMVKSASPHTLRKYCLDLNGLKIFLEERCNLAPSSPW^LATEKRK 
VSELPFSLFTKEHVRMY I AKL I ENGKAKRT I KRCL3S I KSFAHYCV lOK I LLENPAET r H 
GPRLPKELPSPMrMOVEVLMATPDISKYHCLRDRCLMELFYSSCLRISEIVAVNKODFD 
LSTHLI R I RGKCKKER 1 1 PVTSMA lOWIO I YLNHPDRKRLEKDPOA I FLNRFGRR I STR3 
I DRSFOraRRSCLSGK r TPHT I RHT lATHWLESCMDLKT IQAUjCHSSLETTTVYTOVS 
VKLKKQTHOEAHPHA 

CPn_0025 38610 37684

elaC/atsA-Sulphohydrolase/Glycosulfatase

E LMSSREL r I LCCSSOOPTRTPJ^OGAYLFRWNCECLLFDPGECTOROri FAN I APTTVNR 
rFVSHFHGDHCLCLGSMLMRLNUJKVSHPIHCYYPASGKKYFDRLRYGTIYHETIOWEH 
P [3EECrVEDPC3FRIEA0RWHOVDTLCWRrTEP0TrKFLPKELESRCrRGLr lODLI R 
DOE r 5 Ify;3TVYL30VSYVRKGDS I A r I ADTLPCOAAI Dl-AKN3CHMLCE3rrLE0HRHL 
AEJHFHMTAKOAATLAKRAATOKLILTHFSARYLNLDOFYKEAGAVFPfWCVAOEYRSYP 
FPRNPLUfK 

CPtt.Onii*. I'lft 37 ISTfin 

IT MS tr/pi^rh.rr icji piot^^in 

. :mf.\m:;m I . I P: :u<i i: nrprTYFiiK Pijp [ KO»\A^' : Ki: 1 1( u I ^•N r AYL 1 1 rfrvi-V-z-r/i.^A-jAML 
• Mr rr-;^ :( pu;i.;:;r^i.LVLu; [rNrL^iNwrsTKrpKKtArKDAr:E:;yrTK:;A:;HKG.'; 
lvu:I1un^lKt■K^IFrf<T^.ll.EKf;vN^v^NKTK:;t;FE:;r•HIr:uKll^::pRy::Kf^'::;EIF^^^ 
.:::ia:rjfi'KAKi:KAi*in'ArrKic::K-r:rTTr^:::KKKKKTKn;:rjrRrrr;:;tnKnr;ArKPMVpr>K 
KUKt'Vij.KKrvi'i,i-rEni.rMO:;:u:NF^::;tj;:;;:;PrfVOPKA[LfwrKoi'rDP 

'Itijio::/ J i.,77H ^^T"^^ 

Kilt I* lit A'I'I* • l^jfrf.'rul.inr V'riU»M!-.t; 



PSERTtVDSrrNSDSPtLD. .rvEKU.CE3EEE3ED0STERLLPSELFILFLNKRpFF 
PCMAAP :LI ESCPYYEVU<VLAK:;::CK*i*:GLVXTKKEIlADIUCVSFfCU<KTCVAARILR 
IMP r ECGSAOVLtSI CER TRtn ZV :(KnKrcKARVSYH*Or«ELTEEUKAXSff S^ySSSKD 
CXKt^PLFTCEEL0rn:cHCDrrErGKt3A!)fUVALTTAt?EeL0EVL^^ 
LLKKEU3LSRL0SS INOK: EAT ITKHCKEFFUCEOCKT IKXELCLEKEORAIOr EXrSER 
UlKRKVPDYAMEVrCDEIEKlCri-ETSSAErrVCRffYLnWLT 1 1 PWG IQSKEYHOUOCAE 
[VLNKDHYCIJEIKOR aEL:3VCK:.SKCU<G3 : ICLVCPPCVCICrSICRSIAKl'LHRKF 
FRFSVCCMRDEAErKCHRRTYrCAMPCKMVOALKOSCAMNPVrKIDEVDKICASYHCOPA 

•:• • . . ; \yrv::/'^^^? ^- y • : ."-a. -i" . * ■■■ a: ..•"•■m : :.'\VAnE,va-pT:.\"^j r fCK-^T-RKV'A 

LK X VONOEKPKSKK ITFK : JoKNLs/rVU;Kr iFJ JDRFT|-£STPVX;VATGLAkT^LGGATL 
Y r ESVOVSSUCTDMHLTCOAGEVMKESSC r AWTYLHSALHRYAPCYTFFPKSOVHIHIPE 
GATPKDGPSACITMVTSU^liLETPVVNNLGmGEITLTCRVLCVQCIREX^ 
LNtLIFPEDNRRDYESLPAYLKrCLKIHFVSHYDDVUCVAFPKLK 

CPn_0028 43328 42543 

No robust homolog present in Genebank/EMBL as of 11/7/98

RMf LOFFHP IvrSDOSt^FLPYU:KSSGI lEKCSNIVEHYUiUXIDTSVI ITCVSCATTL 

SVDHALPISKSEKIIKrL^YILILPLIIAIXIKrVLRIILTFKYRGLILDVXKEI^^ 

TPDQEm^LPLPSPTTUaCIHAUilLVRSCmrfSLIQECFSFTKITOLCOAPSPICOOtC 

FSYNSLLPNFVFHSLVSWNISGEERAI/m^KEOCEE^«VKL^CTMOACSFVFRSLHlJ?SM 

QTKDKKAGFGIXTFFPWKIYPL 

CPn_0029 43839 43390

No robust homolog present in Genebank/EMBL as of 11/7/98
SNWERNENIYCFNLFTlYIRFFAALNIRMNDGLRFCYSYIUJlPMIXDSStiRKOOOELL 
KKFOIKLRTTSI KSSL rSLROOLCKREATOSOILYCTSRFOYLNSFEIEDPRI PPTKAAO 
LQEITWSRSVMEUCIKFYVYLNSERNKTKP 

CPn_0030 43840 44529

gcp-O-Sialoglycoprotein Endopeptidase

UCGVCWYSLFnfiramRMYrYKYVrri7rSCYYPFIACVDNQ0VLE}« 

FLFKSKNt^FOGVAVAUSPGNFSATRIGISFAOGU^MAKNVPUiCYSSUX^V^ 

AIJttJ»LCKRGGVLTWSErPEECl/IE}aiRGVGP(iUXSY£EASmr^ 

LFASSFSDKITVEEVAPSVEOIRRHVISOFMFVEYDKOLSPDYRSYSCIF 

CPn_0031 44708 44884

rs21-S21 Ribosomal Protein

CMPSVKVRVGEPVDRAUlILKKKIDKBCIUCAAKSHRrraKPSVKK^^ 

CPn_0032 44881 46098

dnaJ-Heat Shock Protein J 

SLIGNVVFVGSVSGMDYYSILCXSKTASAEEIKKAYRKlJVVKYHPDWnCTWU^^^ 

VSEAYEVLSDPQKRPSYDRFCKIgPFAGAGGFGCACGMaWEDAUtTmGAttJ UE fa ^ 

FFrxaTOMEAFGMRSDPAGAROGASKKVHINLTFEEAAHCVEKELVVSCYKSCriCSC 

QGAVNPOGIKSCERCKCSGOVVOSRCrFSMASTCPECOGECRIITDPCSSCPaOCRVKDK 

RSVHVHIPAGVDSGMRLKMfrrygaWMMe^gmLVVPTr^ 

GFVDAALOOOCEIPTLLKTBCSCRLTVPEGIOSCTriLKVmiWFPfMCI^^ 
VErPONLSEEOKELLRTFASTEKAQIFPKKRSFLDKIKCFFSDnV 

CPn_0033 46129 48171

pdhA&B/odbA&odbB-(pyruvate) Oxoisovalerate Dehydrogenase Alpha & Beta Fusion

ER5HSVV0KQVIS5IR£n/LKLVWC£JtFAEHXKUX5R05GSGCnTQLSCASH£I^^ 

KSLIICKDWSFPYYRDOGFPIGUXIDI^EIFASFLARTTPtWSSARIIMPYHYSHKJCLRIC 

CQSSVVCTOFLQAAGRAWAVKHSSADEVVYVSGGDGATSaGEFHEHUJFVAIilOU^U^ 

lONNKWAISVPFEDOCGADU^WRCHOGtAVYEVDCGNYTSLTETrSKAVD^^ 

ALlLIDVVRLSSHSNSDWEKYRSALDUCLSMDKDPLIUXKEAIfA/FCLSPFEI 

EAOEEVRKSCEIAEALPFPSKCSTSHEVTSPYTETLIDYENSESAONIJlNSEWVMRnAI 

SEALVEE>rrR0SGVIVFCEDVACDKGGVrGVTRNLTEICrCPORCrNSPLAEATl ICrfAIC 

HALDCIHKPWEIQFADYIWPGIKOLFSEASSIYYRSAGEWEVPLVIRAPSOCYIOOCPY 

HSOSIEGFUOICPGIKV^AYPSNAADAKAUJCAAIRDPNPWFLEHKALYORRIFSACPVF 

SHOYVLPFGKAAIVHPGKDLTIVSWCWPLVl^lXVAOELASRGISIEVIOLRTHV^ 

TVLKSLEJCrcRLLVIHEASEFCGFGSELVATMSEOCYAYLDAPIRRUaLHAPWSKV^ 

ENEVLPHKESILOAAKSLAEF 

CPn_0034 49496 48210 

CT345 hypothetical protein
VNFLLPTTCRCI UlAEISTPSLPDSS IVSOKTPPVPDPDSSPOH IPTIPTOAPFKPORKK 
ETPSS I VNAIAFAI LAFLSCLOGVFAICLGCSLE ITHPLFILTAVrr AFTLLYFIKYLEK 
PK IPEPLPTPPPSPTLRAPTLTPEIPAPAPGI PLPPTLPKVDRTKLTCNPDIKYPSTVDP 
KACFSIXKOI^SLDPETRPEORKYSNKlASIUJlSKEKSGFRFHCFKGHFSHDKILNiaCS 
GAWISSHSSMDFSTTLCRAFAVTTCLORSCWEKIKNNIPTPEKHLPIGSCVSCPWDVEE 
GAQLYTSHUVINPPTLETLIKEKMRRAITUCDFSMKEAFTNLVLAYWCrDICIEHNLE 

SVOLEVFGLNNLSADOEEFTTWESCCHLALLESVRILLASKEEYALSNVSVNSISQVPLO 
TACRALFLN 

CPn_0035 51146 49569

CT33',* hypothetical protein 

ARTTLEEDAGSSLKPLPfCTFPCATALYrTHRRERKSEHOMWNRCOVFSSFTFRYPtSSWL 
I RLRASCECFQORH P I FLCGLYVfLAC ITSRGHPECSALILI FLCMFtPRNPICOWLPLASA 
W 1 1 SLMLTPA PFLHDCPISCTFVI HHAC3CQGTYYGEALC lOTPCCKRAHHLSCOILSESR 
LELKK\r^ELECTLHHTi;0IVF1CSNACYKEIPRSRFYIMKEKCRES5CHFLNHRFPSSEVG 
PFASSLLbCTPLPCWLRDLFROKCLSHLFAIGCWHFSLCATTLWMLCALLPLKIKKIWF 
tVLTSLAC t FPHSLi^VWRSWIS'/TLLCFSWCFCGSCSCUiRLGAGFlLCGIFFSPFSPTF 
Vt^'FLATLG I LLFFPK I FS FLYTPWTOFLSPPWLYP IR YLAMTLAI3LSA0LF IVLPIMO 
YFGi?LPLBCLLV»LrvrFTrLPIIVFLrAT:iLPCCSPtTEAtIOCFLSHPWLHNPNILK 
TLSFAPVPPWMLTLAiILILFFtGILRThA/SPYASrSATTi'RFIETL 

hypothetical protein
AK::iA/r*:;EftKKMKKI'nNf>:rrKOVRSFFF'FDVLC t EOLf-KEMCWEWrTAK [ r-PLPRCWrEL 
HIU'^KFJjP I DFr.LDl^fKl-.VW: I EHKEr.?r> rCRFF3LLET tEVY lYRLEKEPYOtJCM WF 

uDnift.-i ;iv :i-:rM.i.nrii;MimLr'r'C/:DR»YEKFF.':[HrK;PGKWEDB; r FiwrUwXK'AjOK 

I .I'r.OI .V/MNKW^AKr :Y:;i 1 J t FI'm;YEEPFAYOC FFFDPE [ RP UI :i tP/LLNKETXE 
iiK; :i J-T I El .U 1 1 . ; K: :yvp: :FU.vr . W/UI3EE'/'^NE 

Ki .t^ i c I ' ;i.HA r iLLz:: :\n wtrr r v i vi'pf : twjKfTnr/LErsEKDrrQDo t kei.va n • ivkn 
AA« : rnvf' PA. :v t VH i .i*t * iF.rt (//i l^TYAr;KTI fiAKC tHZ UJMJ'^PT/xsF. 1 1.-/T tH::rF> 
"TjTrt hypothetical protein

MCrTOUSTTJNEEWR I ACrn tViy^MALCKVFFUrrSPLHVRELTLPQEEVEHEIHRYYKALN 
f'.JKJ0tV/UEXJEVTrx;CX:U:EVS3rU?AHLEIMKDPU.TE£W/riPXDRKNAEYW 
MCK r EECLTAVRCM Pr:'-r/CRVOD t HD 1 3NRV IGHLCCCHKSSLCE3D0NL I IFSEELTPS 
rr/A:^ AN^AY I RCFV-r//r.AAT-HTA I V3RAKS T PYLAN I SEELWN f-y^J??! 

tRWLLDYSV r LEWLCA I AKA3UX;:;r KVL I FCVSDVSEI I EVKKKWET I RTRF PKCH 
KVSV<rmiEFPSAVV^IEE:LPECDFLSIC?niOLVQVTU:iSR£SALPKHWiV^ 
RM I HKVLOAAKO WPVS ICCEAACOUJLTPLFICLCVOELSVAHPVINRLRNH I ALLEL 
N::Ct£ITEALLOAKTCSEVE£LLNRNNKITS 

CPn_0039 54256 53963

CT339 hypothetical protein
IS^CSGYAKKKKEAKIME0C^LEM£ASLLEKRyEG0ACNGLVSVVI^KaCCOLISVXVQPT 

CLDPEDPEVI EDLFRAAFKLAKEOHDOEMSLMRSTMPF 
CPn_0040 55673 54319 

dnaX-DNA Pol III Gamma and Tau

A^r^HSt£rr^^fftoPYO/^SRKYRPOI^lEIU»sswAVua^M.VFTfflAAHAW 

CTCJCTTlJ^ItJUCAUNCVHLSEIXJEPaJQCFSCKEIASGSSLDVLEID^ 
IMrrVLrrPVKAKFK I YI IDEVHMLTKEAnWIXKTLEEPPOHVKFFrATTEIHKI PCT 
LSRCOKKHWRrPEKTILEKLSUiAODDHIEASOEALAPIARAAOSSUUiAESLYDYVI 
LFPK5LSPDTVA0AljCrA3ODSUinJ»lAIU)i«]nrATALGIVTDFU4SGVAPVTFUro 
LFYIWUiTNSTrSKFSSOYKTEOU^I IDFUJESAKHLONT IFEOTTLETVI IHI IRIY 
QRPTA^ELISS IKSROFECUWI KEPTLTQC3VSAP0POPTYKEOSFLEKKNQPAAEGKI I 
SVEVKSSAS r KSAAVt3TLL0F AWEFSCILRQ 

CPn_0041 55888 57342 

No robust homolog present in Genebank/EMBL as of 11/7/98

CKYLYHHSYPPPOHSVGS ISSRYKIJlVtJVITrLVLGVUJ.ISGAlJ'LTUIIPCa.TACVSr 

GU3IGl^ALGGVLWSGLLCU.VKREVSKVCPEEIPAVOPEETPBC?/PVT'PFEKPAU3EA 

QKEOKTOKILI»LPOElIX3t^RyiOEAFACL5PUa)UWEIXX3riODVKEEFQVFB 

DMIAEFVELOOIIXOEGRLLEr^mJOTRyrCRDUTOlEDSLYKLWEWL^ 

LKKSAREWDRFMRTTCNIRKIAm'nJRHWSVAKTAFEKArGALETanrESHi^^ 

~CEYEKAKI^DEEKSAHAE0RFX:DiramWEDVKnAFTVrnCEIXSCIEIDDAI(2J» 

RYEEHRITRARVfnO^AEHOU^TMRVKDSUlEHNEARVAreKERSKENOROVOK^^ 

LRDlJCEUiDOELPRAOERlJlEWALYPEIAVSVVEARREVASDLEKAHESIDKHYO 

EOELY 

CPn_0042 57346 58182 

No robust homolog present in Genebank/EMBL as of 11/7/98
FFKFKQEAEFRENGTK IRSMEEVSEYLQQVENQLESCSKRLTKMCrrAJJGVRLEA KEEIE 
SI Il.SDVVNRFEVI^RDIErMI^RVEEIERMUWAELPlXPIKEALTKAFVQHN 
TKVEPVTK£SPAYLTSEERlJOSLNOTt^RAYKESOKVSCt£SEVRACREOLKDQ^ 
QGVSLIKEEILFVTSTFRTKFSYHSFRUn^PCKRLYEEYYDDIDI^RTRARWMAMSERYR 
DAFQAFOEMUCECLVEEAOALRETEYWLYREERKSKKKH 

CPn_0043 58432 60372 

No robust homolog present in Genebank/EMBL as of 11/7/98

HHRriMC3VPLSP0LPPPPPDHSVGASFCLSKrRVIAITn.VLGVLLLISC»^ 

VSLCVCLCXSAU;SVLVISGFUIIXRREVSCr/CLECIPTCIP^ 

AKQIUXJLPOELDQLDTDIQHVl^CLGKIJCDUtCKDRGLLKDAJCEKW^ 

FVELQOVMDOESRYLEGLIHEVOSrAHKLFVOOVNIRSHLCESCCYLPSEDV^ 

AKEWARFMKVTRDIRKIAKAFNKNAYtyUKNAFOKArCSI^rCLYXSLTKSYI^ 

KRAKILPDEKNSARAEORFREVKKiWEDLKETVFWKEIXSRIOIEVLTAVG^ 

LILEKRia)KVMSH0tWEATMRVKEAEVTYSVARVAFEKIX;SQOWKKF0EKTKEM 

DIJTOECHRAOERLEKLTALYPEVSVSVVUTEREaKFTILEKAYGNLEERYOSWOOOEDY 

WTEOK>mEAEFRAKCTKVRSMEEVAEHLQILEmiEIX:YKRLSKA£TFAIX;^^ 

EYTILSDAANRLK'/LCEDIEErrLPRVEEIEWtLRMAERPLHPIKOArTKArVOYNRCra 

LAKVEPYYKESPAYVNSEERtOSLDOASOCIORVPKGFKFRhCSKYI 

CPn_0044 60278 60778 

No robust homolog present in Genebank/EMBL as of 11/7/98
lAKStXrRVWIRLHSAYKESOKVSSLETEACTYREYLREQVQOFETOGVSLIKEElXFLSS 
TLKSKLSYDPL X ANI PCMKFYYCYYDDI DKARAOSRWLEKSERYRNAJCRRFOEIVKKGLF 
KEAKPLKKEEYRLLQEERSNKEKRLIYNKMAVARORVOEFESMEI P£ 

CPn_0045 60961 62790

CT345 hypothetical protein

CKYTYHPPOLPPDHSVGATSWOPKUl ILT ITFLVLCVLLLI SGALFLTU^VPCLAACLSF 

GCjCICLSALCGVLWSGLLFFLI RRGVSKVR pee: PVTPSHEAOK I LCOLPOELDOLDTS 

lOEVVSCLGKU(OLKYEDCX;LLTEVOEKLRVFDrVRKDMVTEFLEU:OWAOECOFU^ 

IN0V0SISHKU^DWIGAHLA£irCYLPSGDVRVERU«SAR0WDRF>1RVTCDrr^ 

AMAFDENACCVAKNAFDKAFCALEECVYKSLTE5YREAFYEYEKAKILRNEDVEWL0DKN 

KSARAEORFREVKORWEDLKETVFWKENGC I DLr^LTAVCCWPDRCPEHLI PEKRRNKV 

MSHKLWEATMRMKCAECTYSVARVAFEKIXISRKNOKK.FOEKTKEWLRCLKDLHDOECHRA 

RERLAELEALYPEVSVSWETERETKFKLCTAYCNLEEP.YQSWRIiOEOYWKEEEWKEAE 

FREIOTKVRSPEEVVEYLO r LENLSEDCSKOLT r Ar/*/^;LGVELEATA£FEYTILSDAAN 

RLKVUTED r EDI LPRVEEI EIMLR I AELPFLP I KOAfTKAFLOYNSCKDKLAKVEPYCOE 

SVDYKSCFRV 

CPn_0046 62775 63263

No robust homolog present in Genebank/EMBL as of 11/7/98
ER FOHLNODLONWOECOKATCLES EVS AYRDHWEO ITEFETOGLDV t KEEIXFVSSTL 
KSKLS*^0PLlA0rPCMKr/EE\*YDCI0KARV0SftWLEK3ERYRKAKK0r0EMLKECLFKE 
nO/\LKKAEyRLLP.EKRKNKEKLL ICNKI EAAOORVOEF'^P.IDr; 

No robust homolog present in Genebank/EMBL as of 11/7/98

Kl I FHLK.VT r.f :( Ff ( l-V l.a ; [ LTH^r*I ! FOK I RMTLTTVir/tJIKrUtKDYELWFVYnCCrEC 

KVKi//r:::;iiKWi. 

• In.O'MM '.u^At riSROl 

•V'MK '.tiii::";! vt.fl hy^Ntchoc iiM 1 IM rror^iln 

MKF.rpiii'^;vMr<Aijmi.:;nowvHVKi.YTFvnc::KivArFTFAwr.KvmTrrKAc:EiSHis 
i.TAr'Mi>F.':j J w:;ai ikkykrtaii t:;EAFt :kvyi it.TL;:r'";M.':Ka*J^ADENTOYWFKKAAD 
l-I.L:rrfiFVlJ::::iVK« *I*KDU' lYrr-LUJKEKICTLF.lM j;i::MKf ;fWlACi:FaiLK iflioen 



CPQPCFOArMDri-KIAfIFE\ ;ch50CVKGEL:jCKRC tEK tTKTTPtLEKYQR IDDRC 
AKtUCQLRA0!X5VTn'LF3Cftjt>Cy^tFV'VLLIU.VOfCALKAU:PEJ4LKSP^^ 
aTL^LLJ<RGTE rFa^-TW(tYLJYjpetftvPJTA\a^uy-FtpGOP lAG(iG^^ 
0LWNNSWFt^INLU:3WariiVSLHflVOTtSSVFV«t?Mlt^^ 

ALYAIX: r ESFVYSL :7A 1 3WAL I PVFEA5FCASTNrSLI.TYt^PENALUCRIJICEAPCT 
YOHSVLVCSLAEAAAOA tCAOSLYCLVAAHYHO ICKLINPCFFSElJOKr UX3SCHSLSPL 
ECAKMIKRH IPECWrtJUlOACLPESF ICVIEEHHCTSVI RSAYVSHMVENPSTCSFOEEl 
FRYSCNXPS3KETT r tMlADSFEAASRSLKNASLPOLORLIDOI lOCKWDCOFSCSPrr 



No robust homolog present in Genebank/EMBL as of 11/7/98
U<EKRiWIVYliVIYOEI FWLmJlO PYYDK ILTOn-IY I PGKTKKDSNKLFQKXSW^ 
VDEKPFSLDCFSNVFLIFVSLVP I AGLVRAYQ I KKSLORTTVQ ICYSPSLSCBOKECVEA 
rVNGYCLICISI LC3lX;iLVP I LI LVVI^IXUjCIUII^SLSTYESII^ 
AT 

CPn_0050 66849 66499

No robust homolog present in Genebank/EMBL as of 11/7/98
VSWFPILCIFUVMRYAWiOTNWtroDAnCANLCraPSTNCKNAirRKSS^^ 
OGCCILLPIFIIXIJUIXISVLFQLIMLPrRLCCFALRQSVSSPIVrNT IJH^HTTLX 

CPn_0051 66797 67111

No robust homolog present in Genebank/EMBL as of 11/7/98
CFAYLIARNI PRMSmOT-IHPCVLPSSHAODVSRSTVYPSRSF IMRRMUKMNTrmW 
KSSEOLMDGHRIPLI FFGKHHPTIS ILNVNRFSWLS I FYNGERCF 

CPn_0052 68008 67304

hemC-Porphobilinogen Deaminase

KMLSVCYSDPCI^DFCOOCRPUII ASRNSNUUCACVHEC ISLLRSWYPKLWrOLSTTCtT 
CDR£XKIPLHLVE2JSYrrriX;vnALVHKGVCDLAIHSAKDLPETPSLPVVA 
U.WADHYVHEPLPI^PRLCSSSLRRSAVUCOti"POGOILDIRCTIEERIIX)UnCH^ 
rVlJOCAASLRU{UmAYSI£U>PPYHAI^LAXTAKDHACKWK0LJTPIHC^ 



CPn_0053 69350 67986

sms-Sms Protein

IRMATKTCTCWTCNQCGATAPKWlXKJCPCraWNSLVEEYVP^ 
SIELEJ^ESRIFIOHACSroRILKXWRCSLTUJSGDPGICKSTLIXQTAE^^ 
YVCCEESVWSLRAKRi/IISSPLIYLFPETrrtlJNI KQO lATLEPDILl IDSIOI IFTJPT 
LNSAPGSVACVREVTYELMOrAKSAOnTFI IGKVTKSCEIAGPRVLEKLVDTVLYTBGN 
SHANYIWrRSVKNRFCPTNELLZLSMHADGUCEVSNPSGLFLQE XiUmUSH IIPttEC 
SGALLI EWALVSSSPFANPVRKTACFDPNRFSLUAVLEKRAC^/KLFTODSTU 
KX lEPAAOLGAUAVA^SLYNRIXPNNSmGEVGLQGE IRHVAHLERRIXECXXi^^ 
AXLPGGOZSSLPKEIREKFRLQGVKTIKDAIRLLL 

CPn_0054 70089 69313

rnc-Ribonuclease III

TIOT'PPIKIPNSKFKIXyU^MHPPIDITMEAKLNrrriCPKLLEIALTHPSYWIE^ 
VQIEDSERLEFLGDAVIXa-rVTEHLFliFPSMDBCTri^ARASLVNAKACCT 
DYU.IGKGEKI05ERGRLSAYANLFES ILGAVYLDCCLSPARXL'tVPLLPPRCEXLPLMS 
GNPJWIXOOrrOKOFRVLPNnfOSTAVTDAOGNVSYOIOVLVNOEVWGBCMASSiaC^^ 
AAQQALDTYaTOJQNTMDV 

CPn_0055 70096 70590

CT296 hypothetical protein

CFWICYLIRIRMRSAUILQHIJWFHNHGSXIJ-DJIXTIKDCFU.ErKLONFXAKASCT 
TVKWRENIFRSMPEirrVVRKRRU5FFAAELVKRPKI^LVlU3LWVrFCEEII.EX;^^ 
FLU.SGDRACSCIFFTGPYPSDLYELEKCTTCLLLAFSSVGXPVX 

CPn_0056 70917 72746 

mrsA-Phosphomannomutase

EFUa^LHRISLMKEVE0RIRSLYDAVTAafXCRVa.5NIXnC0DAKTZLCUtX^ 

DLPCATLTFGTCGLRSL^crCTNRI^^JTIRRTTCKLV0VUWtt-PHPCOPI^ 

RHNSIEFAOEr^AKVlAGNGCEVFt^OYPEFIJa.VS^^VRYERAZGGVM^rASH^^>PKy^ 

YKVYMASGGOVLPPUSQEXVAACSAVNCILSVPSIDHPNXHLXGKEYCALYROTLICOLQL 

YPEANRISGRSLSISYSPLHGTCISLVPHVIJCDWFLSVHLVEKOAICTCDFPTVOLPNP 

£OPEALTLGIEOHLArroOOLFIATDPOADRVGVVCL£ZX;OPYRFTOJOKA5LIJ^ 

WSKTRHUJEKDKUrKSLVTTEMl^IAKHYHVDLINVGTCFKYIGEKIESVmNS^ 

GAEESYCCLYGTHVEDKDAX t ASALI AEAALOOKLOGKTIjCDALLSLYETYCYFANICTES 

WFSAKTDEOEIRKKLSHLEEXSSANFFSGKYOVEKFEKYKOCICFNLLSKDSYALTLPK 

TSMLCYYTSGOGRVI I RPSGTEPKIKFYFEMSTHYPERVTDKEIOKOREAESFOHLDDFI 

FOFTEKTSNL 

CPn_0057 72913 73554 

sodM-Superoxide Dismutase (Mn)

I LKRYVWSFVPYSLPELPYDYDALE P/I SSEIMILHHOKHHOIYINNLNAALKRLDAAE 
TOONUIELIALEPALRFNGGCHINHSLFWETtAPIDCXXX^OPPKHEIXSLIERFWCTODN 
FUCKLIEVAACV0CSCWAVrtjGFCPAK0ELVL0ATAN0OPLEPLTGKLPLU;vi3VWEH^ 
U)YKNVPJ4D\*LKAFF0I INWCH I ENRF3EI I SSK 

CPn_0058 73627 74562

accD-AcCoA Carboxylase/Transferase Beta

IRWLVRLFSYDKPK tKVOK IKAOCFSCWLKCNHCHEMIHANELCONYNCCPKCSYHYRIT 
AI ERVKLlJU>KDSWRPLYTDLKSODPLEFIirrDTYANRLEKARKmTESBCVIVC ICTIC 
U^PVALAVMDF»^FMAC3MGA^AraEKLTP.LIEEAIETRLPV^rVSASOCARMOESVFC^>IO 
MVKTSAALAKLHEACLPYISVLTNPTSCGVTASFAALCOtllAEPKALICFACPRWAOV 
ICEDLPEnAOKSEFLLEHCMIOKrVERKEUrrTDyrtXDYFLAOEYTOCKGKAPRDLSKR 
LKEIFLLTODSE 

CPn_0059 74562 75050

dut-dUTP Nucleotidohydrolase

IKHHTA JCNDN I ICNA I UnVFCELD.':OGELPE-/TTFGAArjVDLRAN I EEP CALLPtXJRA 
L I PTf ; t KAE £ PEnVELOVRPRSGLALKMO tTVUKPCr r DnOYRt lEIilV t LI NFX:D*JTF r 

[EPicMPiAOWL;:rvvoATFWKOE.';u»frrARc:y/:FCi(TOA3 

TPn.Ofi'.n 7S004 VO'i.lM 

ptsN-PTS IIA Protein

P.KLPEr/EVLV t LEOAKMPLWrOHOOLFriLFJLL^PRLVMFLCKI WROE r Loot .Tni VDA 

aglledkoaffdalvrren I nrr; tcwr/A i pi toKLECc:;riFF tA k; ii rro : t lwu\ c o : 

ALVRLVFL lOCPENAOAEY I JCLL.'7rLTLr;UH:E:;RR0Or*L0VNT I EEVMNVPVf M 
ptsN-PTS IIA Protein + HTH DNA-Binding Domain

« JHEC rtr.:L;VKMDUCLDD/AJLLDVCEHTVL0WUtE3CAI PSYSHNNE/RFSREEIENWIX 

KNOALM tOERt :EDK EALKDL^ LKY3LYKA IHROCVLCDVWHSKEEAIOYASK^ 

LDESVtFDiU'HR QJLMCTC tCEC I ALPHAKDFXINAYYDIWPMnAEP tEYCAIJCKP 

^X:ILF^L^ACODKCHU^LV^«rVHLCKSLNARSFFXWPNKDOtXAYVXE>^ 

.'Pn.OO^r; 7«$l 77600 

■■• r • V' ;:':r • " "i- VT !; : k !'t I AVt„*; : : : f: : r- :F'tu\: ":. :v ;: ; rr.'C IT. t 

t■^/VKKP0KCJE:RK0AKKEPRAJlK0rLVPSSKTL.3ARA^^KMKNSSRKE35CC;CNEISANST 

PRSVKLRRNKRAEQKAAKQGFSAFSNLTLKSLIJ'KLPSKQKTSIHEREKATSR FVNESq L 

SSARKRYC7?SSAAP3LFI^EIVRAPVERTKEU}D^reIHIPVVQV0TNPKEO^^rTO 

LASOAS IQQSECTEOSLREIAOGASLPVLVRSNPEVSVORQKEELLKELVAERRQCKHKS 

\^OALEARSLTKKVARGCSVTSTUlYDPEICAAEIKSI«JOCVSPEARfiOKYSSCKR^^ 

fWKOOKTTPSEDASOEEOOTCACLVRKTPKSOVASKAONFYRNSKmraDSYLTAN^ 

SSEETDWPCSSCVSKRirrHNSISVCTMVVTVlAMrvaU.IlANATESOTTSDPTPPTP^^ 

CPn_0063 78109 78267

No robust homolog present in Genebank/EMBL as of 11/7/98
PMYANCKHNCLCL'rDFSRHRS P PGLPLTFTPPVSFTU; IFIJSRCLSTSNIVLL 

CPn_0064 78340 78576

No robust homolog present in Genebank/EMBL as of 11/7/98
LVhmCtOCSAOYYRSRPAERAOTPPQPFIJ^RADFWEWiPRFSACCRVUXVAWVVl^ 

LFLFVMLLPLAW3SYLLAF 

CPn_0065 78882 80651 

CT288 hypothetical protein

yDYYKY^mFFKKNY^f^DFPTHFKGPKrWIICVNP^ff•FERNPK^^VIOITAV^^ lALL 

SG IVLI ICTPLCAP I SMILCGCUASCXy^LFVOCri AT ILOARNSYKKAV^ 

ERPEUCAU)YSLDIJCEVWDLHHSWKHLKKU3UILSICrOR£VU«IKIO 

MISENYDACLKMrAYREELUCEQTQYQETRFNOMLTHRNKVU^XLSRi™ 

SUCFSTLSSRMSRIHTTTTVIIAI^VVSVMWAALIPCXIIIALPIUAVAISAC^^ 

LSVLVROILSNTKRNRODrmJFVKNVDIEIJWrm^RnXEMlJC^ 

0DWYT0YrTNAPrE3aaiE£IRVTYICEinAQTKK>OCn)IXFt*Eli^^ 

SETPin-QCKEFAKIJWaTSONrSTIYGPDNEMDPErSLPWMPKICEEEIDHSrXPVTKL 

EPGSREELLLVEGVmnXREIJJMRIAtJJOQOI^SVRKVmHPRGEKYGrW 

MLEGAFYNHLREAOEEITQSLGDLVDIONRILCinraCDSDSRTEEEPQE 

CPn_0066 80916 82655 

No robust homolog present in Genebank/EMBL as of 11/7/98

G^nfMANPTOSRPPSPEISIEEr^EIJ«SSNTC^ISNTPPPSCAATAEEVSL^IBOCWl 

NSEDEEGPU3SCEVYIWVCITW0GDPEVRDHE\mVMYINGSCarroHECIUJ^ 

EPVR^IHNSCVGU:SCFLCIRNRtPPRr»mSOAIOARWNE^^IFAQ^A^mDW 

GGLYLOVAIXNSIYSHHILCVGIGSSYyiCXa^YRVHNYRVTGIVrTL^ 

LPYADSABGLFLPSVRCPSY0WALRCGE0CLIMDNNQQVGFTIPODSSSEIALVVNI2CDH 

STWTRL r EWIDRGDSOAVLELNPQPSHCRDIALTALYATTRISSLLQBCLMI SVTYAPEV 

FVTYAI VTCYS IKTLRYF I UXT^roPGCRRHFRVIJ^AAUaX3SU;FLTVUI«INVraJl 

WRRPPLISVIFCTASFATGSFrYVDLTWaTTSIJlSRU3LrV0RW.TXat^^ 

HUJSLRFSQNALITFHGGLFMPL r IGFFNOLVICJVPRWIRPrnTAVYDmQTSQEAWDS 

GDVlAIGCTTINFLIXWILLVINTFFFVRSVRroiLHRRPHR 

CPn_0067 82920 84053 

No robust homolog present in Genebank/EMBL as of 11/7/98

KCSCYSYICPPMAVECRVNSSOAl^IXrOEVIJ^WCQSKCUJUirRILSW 

LIALTLASILTSVPYUOjCVFtirVTLGCIIFALCSEXIKKV^PrPrSHKEEirAWFEro 

KNICMEKEKEDPEHFGRTATDIPMRSAI^FNHSCHHIHESPALTETYRSKOCVLLFXDW 

CPVTLPDVTSEEEVLrRS\nA3SYLUI£AC:VPICVSMLIDELHNK^ 

ORKASFlJTOKOLATFFLAYTRVNCGHtJVPFRAGAKWrLIHyVRLRROHNC^ro^ 

CYYARLAFKQTORLYHOLFNVEXLRS lYANMDKDPLCHPWAFIPIYDLUCTEDHGtCFLE 

QOEDREYPSRAAODOFWC 

CPn_0068 84909 84331 

CT360 hypothetical protein

SFMIKKFFIYSLIFSCSFSAPLKCICNEDVSSOSRIEEDPEVLITOLNELIETPIEEGKE 
IRNEWAISDGOKSSEEIEESCGTSDSECI^EKTDKESSNEYVLDFrDSMVORLEGISKM 
COSGQVAO I IrCF^m£FDIR^JRElXLKNR^,EUlEKOL£FKKS IIX^WKEKV 
EODIKQTLMLUCK 

CPn_0069 85191 87086 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LNFLYWLLIFNU;I^r^TPPPSRSSSPPPYCWIELODLCNT^INNSSRATPPPPEVGCELP 
PYFSASNFWTERGAPSLPSPOOLLSLPEYSROPPPGYFDETASITSRTSEEMFGTLVST 
LCCPANSERI*reDHEVNCIYIASTSDTOLEAVOGGMHITELRCEPVRVLYETCHLYAFAR 
ErrrCHSRLEVSHTVRA>frYFWDRFFSRHWNVCRRFLVFY0OCCAYV0AALDSSMH^ 
n*LCU;PTVYrRCt«'HVOHYRVRGFWPSCLDSLAACADrrSVUnfCESSDGIFYPSLFSH 
TFDNAI RYGERCLLVCSECMGMLPETQOOTSPLTSLECGHEVALVLNPOONPEALS I ASR 
LMHEERCGRLESNYMPGRSSNPFKTSMYVLVRLNTLAQI YLMSPYYSFOSND rVCLIF IS 
nAAVEr/SYirLT\n'DSTCGRRYLilVPRLVCTGIJWLALPTTLL£U.ILSYPRS\^^ 
NVRFI LC-^KCTTRWFFAWm-I UiWPFRCUWGIOLFVHRS r IGHTUMrTDLTLASMR 
YAXVFPr^IVCSCU.TALAHAKmiLALDPYRLIESGDLRRPAFNDODIOOADNPWDAYSI 
GLVINTC tYML [LFANLIFMVYSVRRYHRSRR 

CPn_0070 87399 87208

No robust homolog present in Genebank/EMBL as of 11/7/98

YKVr:LFHLKNONFFSNOSRTYEORFPKVSPHFESrLPJjOSVGFSSCXrrLt.ISFRClTELKR 

fjLYI 

'.'Pn.OOV I HqO'i'; 8759? 

Tiav* tvp'^thor iciL procein 

I k::u<:: t lef rr ploharcxkkohk i ieelfpepfofdhlylklmensusrdafdkkrml 

KKNI .V/t X.V::OLYLYEWOOG [ LFFFTVriOkLMi CG I AS LFTEW5CETPST I LTCKPr 
KOI'.LTr-YL; : P :ni JJt? lEJLYMRMKO t AVQYLKPPCT 

'fttjin/.; H'llSi H«0S7 

'm2A iiy(xir.ht*r ic.il procuin 

1^* :YK.TrKT::wKeKvi, I LI YCLLFYFn!YP>*:rrpu:=cf; [r;p:;ooYVPOELFCORLG3SR 
:MA:;r ;u;p t v::pp i ::ALVALTDLKLVPYNOn.';F.'TWTTRLKNAVEK iglflqrnwk 

^ H .LY I ( J\WAL I LVCI 11 rrVALTLT r WUTVnLT. ICV^/n t FTATCLDKENKHRHVNSLWNL 
Ulltt; I IJJl Jjpfr rrnO l LLATM t as 1::AL tYAVPi^AVGLVICFn CCNOLJ imVYCARLGD 



E.\TYAr DRKAHKKP IGt IZ ^HO 1 1 K.M'. :NCKCLNAL : EI^mN^CTCPATANLlAS 
LKU4LN0PMPYCFrJ<PECCV-l ^3YL3LtfftfJ:;rCD 1 1 AftADOC IMTLSCTLOO IKKEPDRI 
£E3NH 

CPn_0073 89353 89574

infA-Initiation Factor IF-1

SMAICKEDTLVL£X:K■/EELLPC^mFRV:LENCMP^^'AHLCCKMRM3NIRU.'/CDRVTVTC 
AYOLTKARWYRHF 

tufA-Elongation Factor Tu

OJrOlSKETFORNKPHINrGTICHVDHGICTTLTAAITRAl^KLASFRDrSSIDWrPEE 

KARCiriNASKVEYCTPNRHYAHVDCPGHADYVXNMITCAAOMDGAILWSATOW^ 

KEHILIARaVCVPYIVVFLNK\WlSOEDAEL:DLVEHELSEU.EEKCY^ 

KAl^mANYIEKVRELMOAVDDNI PTPEREIDKPFLMP I EDVFSISCRCTmCRIERCr 

VKVSDKVOLVCLCmCTrVTGVEMFRKELPECRACEJAraLIXRGICWn^^ 

NSVKPHTKFKSAVYVLOKEBGGRHKPFFSCYRPOrFnirrOVTGV\mJ^^ 

VEUJVELIGTVALEEGMRFAIRBCGRT IGACT ISKINA 

CPn_0075 91087 91350

secE-preprotein translocase

SRSWFraCCX^KNRKALSRKIGTVXKQAKFACSFLOEIKXIEWSKKDLXKYIKVVLISZPG 
FGFAIYFVDLVLWCSITCLDGITTFLPC 

CPn_0076 91334 91903

nusG-Transcriptional Antitermination

OPPCSVNCHYWWVDVFTAOEKKN^KALECnCESSGMrDriQEIILPIEWMEVKl^ 
KWEKYIWPCYLLVKMHLTDESWLYVKSTAG I VEFLOGCVPVALSEDEVRSI LTDIEEKK 
SCVVDKHOFEVGSRVKINIX;VFVNFIGHVSEVrHDKCRLSVMVSirCREn^^ 
EEVAFGOESE 

CPn_0077 91956 92435 

rl11-L11 Ribosomal Protein

FrVSYTLrVEVS0CKVRFSKSVKK\aKriKLOrPOGKANPAPPIGPAIXUA^^ 

EFNAATOOKPGDLLPmWADICrFTFrTKOPPVSSLIKKTLIJLESGSKIPNIU^^ 

TQAOVHAIAEOmnMOIVIXESAJCRHVSCn'AASMSZCVE 

CPn_0078 92453 93160

rl1-L1 Ribosomal Protein

SCRIKTKHCKRIRCIUamJFSKSYSUlEAIDrUCOCPPWIXrrV^ 
OOIRC»VIXPMyracrLRILVrASGNKVnC£^\rty^CADFMCSDDLVEKI 
TPCttCtEVCKLGKVLGPRWLMPTP K r C ' tV I ' l Uy AKAISELRKGKIEFXADIUCVCNVCVG 
KLSFESSQZKENIEALSSALIKAKPPAAKGQYLVSrriSSTMGPCZSIOTRCLMAS 

CPn_0079 93170 93688

rl10-L10 Ribosomal Protein

RGnOCQEmJLLOCVEDKZSAAOGFZLUlYIJlFTAAYSREFRNSLSGVSAEFTVI^^ 

FKAZEAACI£VT9C5I7ra:HUnA^SCCOPVSAAXOVIJ3FNKOHKDSLVFlAS 

GA£VEAVAKLPSIJC£LRQQVVGLFAAPK5O^^ZKNSVL5C^^ZSCVD0X^^ 

CPn_0080 93720 94121

rl7-L7/L12 Ribosomal Protein

VRVTKVTrCSI£n.VEia.SNLTVLELSQLKKLL£EKWDVTA5AFWAV^ 
EFTETAVTlJEDVPADKKZCVIXVnmEVTCLALKEAKEKrECLPKnnC^ 
KLOQAGAKASFKCL 

CPn_0081 94219 98016

rpoB-RNA Polymerase Beta 

FRCII^OKSRRT1lHLKCPERV5VKKlC£DZPDLPm.Z£ZOZKSYKOFL0ZGKLA£Qt^ 
GLEEVFREIFPtKSYNEATVLEYISYNLGVPKYSPEECZRRCZTYSVllJCVRFRLTD^ 
ZKEEE\rraCTlPtifrDKGTFZrNaAERVWSOVHRSPGZNFEOEKHSKCMZLrSFRZZPr 
RGSWLEAZFDINDLIY ZHI0RKKRRRXZLAZTFZRALGYS5DADZ ZEEPFTZCESStitSC 
XDFALLVGRZLADNZ ZOEASSLVYGKASEKX^AMLKRMLDACZASVKZAVDAKNKPI I 
KMLAKDPTDSYEAAUaSFVRRLRPGEPATLANARST ZMRUTDPKRVNLCRVGRYKLNRK 
LCFSIODEALSQVTUlKEDVZGAUCYLZRU<MCDEICACVDOrOTLANRftVRSVCELZON0 
CRSCLARMEKIVREPMmJDFSSDTLTPGKWSAKGUVSVUCDFrGRSOLSOFMDQTNPV 
A£LTHKRRL5ALGPGGLNRERAGFEVR0VHASKYGRICPZETPESPNZCt,ZTSL5SFAXI 
NEFCFZETPYRXVRDG IVTDEX EVMTADVEEECVZAOASASLOEyNMFTEPVCWVRYAGE 
AFEAIT^ST^^rHKDVSPKOLVSlVTCLZPFLEHDnANRAI>«SSN^CROAVPlilC^EAPVW 

tclecraakosgazwaeeix;vvofvix;yk\^aakhnptzkrtyhuckfu 
oopu:avgdvitkgdviaix;patdrgelalcwjvlvafmpwygynrecai z zseklzred 

AYTSZY r EEFELTARDTKLGKEEITRDI PNVSDEVLANLCEDC I ZRICAEVKPCDZLVCK 
ITPKSETEIAPEERLLRAIPGEKAADVKDASLTVPPCTEGVVMDVKVFSRKDRLSICSro 
LVEEAVHLKDLOKGYKNQVATLKTEYREKLGALLLNEKAPAAZ ZHRRTAEZWHEOLLFD 
OETinirEOEDLVDLLMPNCEKYEVUCCLLSDYETALORLElNYKTEVreHXRBCnADLDH 
GVIRQVKVrVASKRKLOVCDKMACRHCWCCWSKtVPEAIWPYLSNGETW 
SRMNUWVLITHLCVAAKTAG I YVCTPVFBCFPEORIWDfWl EOGLPEKKSFLYD^^ 
ERFDNKWICY I YMLKLSHU ADK I HARS ICP Y3LVTOOPU3CKAOMOC0RFCQfCVWAL 
CAYCVAHHLOE t LTVKSOOVSCRTR X YES Z VKGENLLRSGTPCSFNVLI KEMQGLCLOVR 
PKWDA 

CPn_0082 97992 102221

rpoC-RNA Polymerase Beta'

CSSYCRRRLKNDVLEK IMFCENSROtCVLSKECLFOKLEIC ZASDITIRDKWSCCEZKKP 
ET INYRTFK PEKOCLFCEK t PC PTKDWECCCCKYKK I KH KG Z VCORCCVEVTLSKVRRER 
MAH lELAVP IVH I WFFKTTPSR ICNVLGMTAGDLERVZYYEEYWt DPCKTOLTKICOtiN 
0AQYREVVEKMGKDAFVAK>O:£AIYDLLK3C0L0SUJC0LKERLRlCrKS00ARKKLAXR 
UCICBGn/SSSNHPEWMVLKNtPV*/PPDLRPL7PUX3CRFAT5DLNDLYRRVINRNNRLK 
AILRUCTPeVXVRNEKRHL0EAVDALFD^K:RHGKP^^C*JlGI^PUCSLSOaJ^^ 
NLLCKRVDYSCRSVI rVCPELKFNOCCLPKEMALELFEPPt IKRUCDOCiWriRSAKKM 
tORGAPEVWOVLEEI IKr.MPVI.L^JPAPT^.HRL^*tOAFEPVttBt;KA tRtHPCVCAAFMAD 
For W<AVHVPL3VEA0LF.AKVLMMAr'DN I FLP.'^'XIKPVA t PjIKDMrUTL^TfLKADPTYF 

pEFHty^KTK tFKDErEvuiALNNrr;F tDiJVF^;r/pRDETnR(: rii I hek IKVR rnrxjitETT 

pr^RVI.Ftrn t VPKELJFONY.'?MPr:KrM r;EL U/y;7KKWLEATVHFI.nnLKDLnF rOATKA 
A t :;Mr;r JtOVR I r-DI K:;il I LKDAYOrM r VKKV/DD( : I tTKJERI l: :KT I J IWTRVSEOLSD 
ALYVKt:;KOTRSKHMPLFLM tD:/y\P' WK:*^l.KOIi":ALRGLMAKrN( yv r lE.'TPITfTNFRE 

f;L-rAs:ir* i s.';HUAKKULA0TAr4CTAD:;f ;yi .ti' p.lvova^dv t itkku xttuui t e tSA t 
ca« K:EELLrLKDn n\mTVArov/or-':oK:;)M.t-Ay:;(:Dvi/isvoAF^iDOAi; ietikirs 

TLTl :H'; HW WTAKCWILNLAf* UiU If J* :i'J\\A I .* I AAOS f IKPt nVt.THItTKt 11/ 5r ;i AAT5 
:rri 'E I tTftlDG 1 LVTMDLRWI i /jE^HNI .V! J IKr 'Al JIVVr :DmilTLNh^KK U ^TTK.'; [ E 

rxwKPVEUWK I i.VADfTTPV: V /^H i ' : wr:i .)« 1 1 p I rtrnKDiF I KYfuiLvw ; i :rrEKW 
HKNTCLVEL [VKOHftGEUIPQ rA tYDDAOL^EL MIPSCAr ISVEEGORVDPCMLLA
RLPRCA CKTKD tTCGLPPVAELVEARKPEDAAD t AX rOCVVOFKC lOKNKRI LVVCDEMT 
- >^EEE«L r PLTKHLI VORGOSVI KO00LTDCL*/'/PHEI LEICCVRELOKYLWEVQEVYR 
LOGVOtNDKH EEE IVROMU;KVRITDPCDTrU.rCEtn/NKKEFYEENRRTEEDCCKPAOA 
VPVLLC rTKAr;LCTESFr3AA3F017rTRVI.T0AACCSKTDYLLCFKENVI»CHHI PCGTG 
r ETHKR IKOYLEKECEDLVFDFVSETECVC 

CPn_0083 102206 103112

jELUIEAiAAJG [ RONCODLCT 

•/ORAVFl^OLFEA>CGDKKRU.VKI PGTWEC I RAVEFLEAXC lAarrrLI FNLVOAIAAA 
K/\KATL I S PFVCR lYDWWI AAYCDEXT^SI DAOPCrVASVSNIYAYYKKFC r pro r MAASFR 
TKEOVlJUJ^DU:TISPKLU)EUCKSOHPVKKElI>PA£yUCKUW 
e0AKA7EKLAEX;iRIFACDT0tLETAITEFIK0IAAECA 

CPn_0084 103356 103751

predicted ferredoxin

SEl«NKMDYKS0LVFSCPCCCKGNVCFSVFTJU3VILTCNVCSSTn'FDSVI 

LCKRIKDANS:LCNATVSVSVn»OlDrPF0LLFSRrPVVI^UttKKlAII«TJDAUi 

TSrLHOESDLIS 

CPn_0085 104512 103766

CT311 hypothetical protein

FSMKFFILF IL I VAOFPATSAOPRTQVSASHSKOAKARRTSRIRSSAATriASVSRYKTRA 
AARKKIGKFEKKPSLS PVOWVRYSGKNYS IQTPSLW IDCKTOLPEKLDVIXIGKCKGN 
LTPTINIAOEITSKSSKEYrEEriAYHKAKEMTLESGirroiOSPSGEFTIIKTEKNSSW 
GRVFCLOATTVr DHTAY I rrSTATLDDYAELSFTFLKWSSFQIRCGKEATSGDAILEKA 
LEALONENK 

CPn_0086 104898 105527

atpE-ATP Synthase Subunit E

NIHANLNAIXKUCOICDALRLDTUCPAmEAAALIiiNAKEOAiCRI lOEAOEEARKILCTA 
EERAHQKIKOGEVALSOACKIUUXAUCOAVENKIFRESLVEWtZHNnTDPEVSTK^ 
VOALEAtXIVSCNLTAYIGKHVSPRAVNEIXCKAVTITaJUaCSVVWSFV^^ 
WVLDLSSSALLEIFTRYLOKDFRQIIFOGS 

CPn_0087 105540 106376

CT309 hypothetical protein 

SHEKIFSrFKVVVMTOYYFLSSFLPTOt^ESVT'LFSISDLDDII.YLNLSENDI/3/YGIX^ 
RFFDFENF AFFWACKP I PFSFGEVTQENVERMLSSOQWSDCNOFEDFFKDFLMNHKSSOD 
RUmFSDLFREFISYHQTNSSKFWDYriU^QOQlJlVVLAGFRWVUfi^ 
DPVVLEVl>{OKDSPNYELPEEFSDIjOGVLDirfCLU>HTtWULALYOFHK^ 
DGNVILARCATYMFAIRNSLASVEKGREIINHIEKAIKW 

CPn_0088 106352 108145 

CT288 hypothetical protein 

SYRXCJmmrSEQTAQGHVIEAYGraiJlVRFDCYVROTEVAYVNVDmi^^ 
OEVKVOVFEDTOCSACRGALVTFSGHliEAELGPGU^ I^IX3^3^mL^^ 

KKVNAISDHNU^^^^^pvAS^CTTLI^«;DUiCTVTBGI^^ 

GTTOAKTWAKARDAOGKECAFTOVgRWPIKOATIEXSEKIPAHKIKDVGIJlILTO 

KOGTFCTPCPFGAGKTVlOHHl^KYAAVDmtCACCERACEVVEVMErP^ 

SL>0iPriX:iian'SSMPVAARESSIYIXr/TIA£nnfR0MGLDIIXIJa3STS 

RLEEIPGEEAFPAYLSSRIAArYERGGAITTKDCSECSLTICCAVSPAGCaJFEEPVTQST 

LAVVGAFCGLSKARADARRYPSIDPLISWSKYLNQVCOILnjCVSCWKAVKK^ 

GSEICKRMEVVGEEGVSMErWEIYUCAELYDFCYLOONArDPVDCYCPFEROIEl^ 

RI FDAKFVroSPDDARSFFLElUSSKI KTLNGUCFI^EEYHESKEVIVRIXEK^^ 

CPn_0089 108111 109466 

CT289 hypothetical protein 

UXrWKKQWYKWRKmyriYTKITDIKGNLrTVTAIXIARUJElJ^TrTRSDCRSSYA^^ 

DLKKVTLOVFCCTSGLSTGDKVTFLCRPME\n*FGSSlJXRRLrCrCKPIDN^ 

EIATPTF^n>VCRr^^RSMVTrI^JIPMIDV^^CLVKSOKIPIFSSSGEHK^iALLMRIA^ 

ADrWIGG^lCLTFVDVSF^VEESKKl>CFADKCVMFIHKAVDAPVBCVLVPm^^ 

AVEEWCNVLVLLTDtfTAFADAUCEIS ITMDOrPANRGYPGSLYSDLALRYEKAVEIAEGG 

SITLIT^aTMPSDDITHPVPD^^t^YITEGOFYU^^WRIDPFGSt^RU<OLVICKVTR£DH 

G0LA^^ALIRLYADSRKATERMA^C^KL5^WDKK1^FSELPETRU^SLEVNIPLEEALDI 

GWKILAOSfTSEEVGIKAOLINKYWPKACLSK 

CPn_0090 109439 110080 

atpD-ATP Synthase Subunit D 

VU0CSMSV0VKI,TKNSFRLEK0KU^U7rYLPTIJaKKALL0A£\^fWV^ 

OAYER lYAFAELFS I PLCTDCVEKSFEIOS IDNDFEN I ACVEVP rVREVTLFPASYSLLG 

TPrWL[7rMLSASKELVVKKVMAEVSKERLKILE£ELRAVSIRVNLrEKKLIPETTKILKK 

lAVFLSDRSITOVGOVKMAKKKIELRKARGDECV 

CPn_0091 110074 112053

atpI-ATP Synthase Subunit I

'/RUI IHKYLF ICRNKADFFSASRELCWEFI SKKCF ITTEOCHRFVECLKVFDHLEAEYS 
LEALEFVKDESVSVEDI VSEVLTLNKEIKCLLCTVKALRKEI VRVKPLCAFS3SE I AELS 
RKTC ISUIFFYRTHKDNEOLEEDSPNVFYLSTAYNFDYYLVLGWDLPRDRVTEI EAPRS 
VNEU3VDLANL0RE IRNRS DRLCDLYAYRREVLRCLCNYDNEORtHOAKECCEDLFDCKV 
FAVACWVIVDR I KELOSLCNRYOI YMERVPVDPDET r PTYLENKCVG^WCEDLVOI YDTP 
AYSDKDPST WVFFAFV LFFSMIVNDACYCLLFLMSSLLFSWKFRRKMKFSKHLSRHLKMT 
A I UIUX: ICWGTTTTSFFGKSFSKTSWRErfS>frH'/LALKKAEYYLOMRPKAYKELTNEY 
PSLKAI RDPKAFLLATE ICSAG I ESRYWYDKFI DN I LMELALF ICWHLSLCHLRYLRY 
R YSO row I LFMVSAYLYVP t YLCTVSL I H YLFHVPYELCCO IGYYCMFCC IGLAWLAM I 
0(<r:WftfWEE t ICVIOVFGDVT^YLR I YAU;LACAMK;ATn^MCARLPHUi:3 1 VI LLCH 
.T/tm LG I MOC/ IMGLRLNF I EWYHYSFDGCGR PLR PLRK I VC3 EDAEAGCI HLDNNSI V 

CPn_0092 112121 112573

atpK-ATP Synthase Subunit K

Mr':::;tj:: I Yi;|-UW[.LMLW\ tKNCTLGPVCO CAUIL-'r/GAALLVSim-X-KCC/riirOAYA 

h; ;: i y iKt. V/U « : i vr.: : F:;[ .fawfallll 

CPn_0093 112440 I I JO 1 5

';r;<ii hyia>t tier kmi ncocein 

':KA:n/v;;Ai:r>:t/<uii.HOYMn;;vwoRu;L3NLFHCLLU-i.RYYY::KLvr^LT^ 

u:\xt /;::Kr::u';;:iTF.YVt :r'EY:;AAAOu; i EO:^cnDr/Y( •^<;wvTW.':LPr;RMnKCLPVT 

t.Yi wv/V':w:KVEKt;pYt-:vNO:;M;YRVYCLKr;LEYKEu> ; : t:;YHVAU:GGMOEIV:;RRH 



CPn_on'».t .. • :. c"- t-ri I'M II ii ntyt-* : , 
valS-Valyl tRNA Synthetase

Vrt'RVFl^BDHKFCLRtMTTEDFPKAYNFCCTEPELVVFW£KNr;MFKAEA330K?PYSVlM 
PPPT<VTCVtHMCHALWrLOCVLVRYKRHr/:Fr/CV f prmrtlAC t ATOAWERHLOASEC 
KR RTtTf SREOFLKH rWAWKEKSETA-L^OUlC :0"':>fDRKRrmCPLANRAWKArK^ 
l.FENCY tYRC*rrLVKWDP\rt/7rAIJODE\'r/EEKCr a-CYY I RYRMVCSQE3 rWATTRPE 

.v-Wrrrj(f; ■ :.rr'- v';r::; :. ■ - - r r \; :;.;vFKErv"Lr 

V. ; V J Y R5J/\V : £ ?V US KVWP/ J V. Ji:..- A wvi J i r '. l^-v. t» ; K L I- 1 KL'K VKNV UJWV'NH LRCW 

CrSROLWWCHR r PVWYHKr^CDER'v'LCVrCEG [ PESVACDFCSWYOOPOVUTTWrSSCLMP 

LTCLCWPDENSPDUCXrf PTAL:.^TCHD I LFFWrRMVUrSSMSGOCPrSEVmiCLI 

CKSYKRYNDFCEWSY I SCKEKlJVY»1CEALPXWAKWEKI^KSKCNVIDPLQ<IATyGT 

DAVRLTUrSCANRCEOICLDYftLFEEYKHFANKVWNGARFIFCHISDLOCWDU^ 

SLGLmFYIUXIFWLtHOLEEAYATYAFDKVATLAYEFFRNDlCSTYIEIIKPTLrCKO 

CNEASQSTKRTI^\^LItmjGVUlP\-APFITESLrUlI0t7ruy^PBCDGn^^ 

MLRSRACMEAPYPKAFDVKIPODUlESFTIJ«RLVVTrRNIRCEM0LOPIUi{LKAFW 

DrrreiOSCIPIUJALOGLESIOUJJKEPEKGLYSFGVVDTIRLGtFVPEOiUJCEiaaiLE 

KERVRLERAVENt£RU£DESFCOKANPNLWAKCEAU(NNR:EL0GILDIG-«FA 

CPn_0095 115956 118790

pknD-S/T Protein Kinase

ACIVCLDREDORSLERYDnm I ICKCCHGEVY:JVYDPVCSRIO.'ALKKIRE0LAeiPLLKR 

RFUIEARI AADLIHPCWVPVYTIYSEKDPVYYTOPY I ECYTLKTLLKSVWQKESLSKELA 

EKTSVCAFX^ I FHKICCTIEYVHSRC I LHR0U(P1>* ILLCEJ"SEAVIIJ>CAAVACCE£E 

0LU)rDVSKEEVLSSRKriPCTIVCTPDYMAPERLLCHPASKSTOIYALGV\^Y0K^^ 

FPYWU«GKKIVmX)RrPSPQEVAPYREIPPFLSAVVHRMIAVDPOERYSSVTELra 

ESHUCGSPKWTLTTALPPKKSSSWmiEPILLSlrf^nftgySPASWYSLAISNIESFSEM 

RLEYTLSKKCLWBCFGILLPTSENAU3GOFYOC3YCmilIK£RTLSVSLVKNS^ 

ODLESDKrrri.IALEOHNHSLSIJVDCTTtaiHKWLPSRSGRVAIIVKC^ 

FESSCSUlVSCIAVPOAFt-^EKLYDRALVLYRRIAESFPGRKBGYEARFRACITVLEKAS 

TD»lN£OEFA£JUE£FSKIinX^AAPL£YI£KALVY0RU}CYN££IXSUXAX^^ 

iniIJ©H\ArYRU{ESFYKia)!UJ^VFMILVLErAPOAITPGOEEKILVWUCD 

LU3PTVLEIJlSSKKEIJ^.SYWSC^IPHU^SLFHRAW^XJSDVIl^LIEIFYVAC0^^ 

55CI0ZF7C£5I£D0KATEEIV£FSF£0LCArLFAI05IFNKEOAEKIFVSND0LSPItLV 

YirDLFANRAU£S0CEArF0AU3I.IRSKVPENrYKDYUWHEIRAHLMCRNEKALSTIF 

Ej;rrEKOUCDEOHEU^YCCYLALr(WAEAAKOHFDVCREDRrFPASUJWfl^^ 

KDAI^OE3lRII-IJU5KFLYFHCL£2«DERDLCQTKYHLLTEE^ 

CPn_0096 124347 118837 

CT296 hypothetical protein 

ETFI^ILREFF>(KSLPVYVSGIKVRNlJWSIHFNSEErVLLTE7rSCSGKSSIAFtya 
ACRKRYISTLPTFFATTrrrLPWKVEEIHGLSPTIAIKONHFSKYSHATVGSTTELFSH 
UIIJTLECQARDPKTKEVLOLYSKEKVl^IKELSBGVOISILAPU^ 
QCFTXVIOICTIHPIYSFLTSGtPEDCSWmDTLIKSENNIARUCNrei^Al^roBCH 
CSVLSDEEUfrFSTK0OIOCVTYTPLT0OLFSPHAL£5RCSLC0C5CXFZSZI»PLU 
■ NI^imJCCSFACa«CSSYLYKTrY0AIJ^DAI^FNI£TPWKDL5PEI0NIFUUa^^ 
VRLnjQTLCKXNLTYKVWRGNnjfDICDKVRYTTKPSRYLSK^^ 
SVATWEGKTFTETOMSL»WHVFFSICVKSPSr^IOElL0CLK0M*SFl.IOI^^ 
RALATLSGSEQEItTAIAXHLGG£I^ZTYZI^EPSXCUfPOQTE]aXGVIKXLRI^^ 
ILVEK£ERHZSlJU)RZZDIGPGACZFGG£VLFNGKPEOFLKNSSSLTAlCyLRQELTZPZP 
CSR£AFTSWUXTEATZK^aJaII^ZRLPIJ^tG^r^GVSCSGK55LZN^m.VPAZESm 
QENPraiLKFEMGCZCRLZHZTRDLPGRSORSI PLTYZKAFDOIRELFASQPRSLROGtTK 
AHFSFNOPOGACZOCQGLCrrKrZSODITrPIPCSECCX^YHSEVLEXLY&CaCFfZADXLCH 
TAYEA£KFFZ5HPKIHEKIKALC5UUJmj>LCRPLSTLSGGEZ0RLIUJ<HELlJ^ASP^^ 
7LYVLDEPTTCLHTHDZ0ALXE\nXSLTYLCHTVLVIEHN»*VV)CV^^ 
GCYLLASCrPKOLZOUTTPTAKALAPYIEGSLDZ PWICSEPPSSPfCSCDZLIKDAYQeOtL 
KKZDIALPRNSLZAIAGPGASGKHSLVFDXLYASGtrZAYAELFPPYXRQGLUCErPtPSV 
CE\nCCI^PVZ5VRKCSSSKRSYHTXASALGI^NGLEKrJAZLGEPFSPLTEEKL SK TTP Q 
TZ ZDS^XKSYKDDY^^'rrSPX PLCSOlXIFLOEKQKBCFZKLYSBCaCYDLDEWJL^ 
EPAZVZOHTKVrSPKNSSSIXSAZSVAFSLSSEIWZYZSOKXORKLSYSLGWKOKKCRLYP 
EZTKQLLSSDHPBGRCLTCaCRGEXLKZSLEEHKEKZAHYTPLEFFSLFFPKSYMKPVOK 
LUCDEllASOPUCrXTTKEriJJFCRGSSEFPGMNAIlJtEOUyrESDSPLZKPL^ 

CKCSCu^DYA^nnmz^INTsui5ZY0EDATFI£SFu^•rcTDryrRSZZ0DL^^ 

GLSYZTLC0R0im.5IX}EIIYRlJilJUCKXSZNLTNI\naJEEPl^UiPODt^ 
LVANNmVIATDRSCSLIPHADKAIFtjGPGSGPOOCFLMOSDrmrcPSVDLHANVPOTEV 
CPKAPLS ZSKANKTRCSDRTLICVNL^ IHH ZONUCVSAPLKALVAXGCVSGSCICrSLLLEC 
FKKOAEU.ZAKCTm'SDLVVZOSHPZASSORSDXSTYFDIAPSLRAFYASLTOAKALNI 
SSTMFSrrn'KQCOCSDCOGLCYOWIDRAFYAtEKRPCPTCSGrRXOPtJWJEVLYBCatHF^ 
EIXHTPZ ETVALRFPFIKKZOKPLKALLOZGLCYLP rCOKLSSLSVSEKTALKTAYFLYQ 
TPCTPTLFL ZDELFSSLDP IKKOHLPEKLRSLINSGHSV I YIDHDVKLLKSA0YCZEZGP 
CSCKQCGKLLFSGS PKOIYASKDSLLKKY rCNEELOS 

CPn_0097 124549 126006

pyK-Pyruvate Kinase

DSMITRTKI ICT IGPATNSPEMLAKLLOACMNVARLNFSHGSHETHCOAICFLKELRBOK 
RV PLAIMLDTKCPE I RLCNI POPI SVSOCOKLRLVSSD I DGGAECCVSLYPKC I FPFVPE 
CADVL I DDCY IHAVWSSEAOS LELEFMNSGLUCSHKSLG I RCVOVALPFMTEKDZ AOLK 
PGVEONMDWAASFVRYCEOIETMRKCLAOUGNPKMPI lAKIENRLGVENFSKIAKLADG 
m I ARGDLG I EI.SVVEVPNUJKMMAKVSRETCHFCVTATOMLESM I RNVLPTRAEVSDI A 
NA I YTCSSAVMLSGCTASGAHPVAAVK IMRSVILETEKNLSHDSFLKLDDSNSALOVSPY 
l^A tCLAC 10 1 AERAOAKALI VYTECOGG PMFLGKYRPKFP ItAVTPSTSVYYRLALEWC 
VY PMLTQESDRAVWRHOAC lYC I EOCI L^NYDR I LVLSRGACMEETNNLTUT IVNOILTC 
SEFPET 

CPn_0098 127494 i;>hO*it

No robust homolog present in Genebank/EMBL as of 11/7/98
r.VCKKFKOIKRTILEAPLYYLVr/; I lAL':RI(TI'R::Ft.Tr;iya(f:FnFU>FYT l.'JDYRKTAL 
TNLALAFPEKTFDERYK lARQI-'LOHLr TTLLi:!.! J^IEOL^AIN tDKLtTrVTlT.TRNPKCFS 

::F:Rvr.';riEDLEETFKNr/?EKO(:LtLF'.':ttn,\riWKi.i'n.Y tTKhfYM; r afaka [knoru;k 
K [ FAU'EVFKOK ( vppKf*; I FMs If J :K I ; r vi a/^mim: :: rm rLFt;:;pAFTTT3 

r-ALUAYrrCFrV I AVWr-ROAKCFFT/ 1 ;•: ;ak I ,YAMK: :f .1 MKK* VA r IHl AJtWf iri^EKC r A 

:XiiF/>^iHKRV«RKi::wtKKKYpy;:MM;yr/ivV;:::irK::KiyAiJua:p::frTTLHLAL 
t wAuiLRELoroFPir^':LtoiJ<Nncor;^i,itt' vf -a r ihi;nm*jm:(KHFurr'>ccA\rc 
::Ki*Kr.Er:;iJ5iiroAri.KN::i.i<tFY::Ktiij^OKi:i'KNi'KVK;:K':r-t<i/ivR 

Cfftjm'ri ij/*,27 \y\SU'> 

No robust homolog present in Genebank/EMBL as of 11/7/98

VYf *Kvrnjf. i::nFAAKiiAi'KitfiMi Ji.: \ vpnTi.-i'A/vri .i . i irv t r-CAfrrrrvoi km is ik 

t AVPAKSrAtriVATKAl.t): \VX^\\'SJJJ\ \1\ . I AA: ;K I IMrO:*. t KM I KKl 1.TKK 
CPn_0100 12M^J l27q(J2

'.T'jTt hypor net ic.j I Dfon^ttn 

RTOKfaPtLLCLE:T^ltKrLJCLr:RHWPRr/'/.-LC-A: : IWIl.VO0SVrmTLTWFVR 
: VDLHPD(7m>CUJK3CmJXKVSLT iTOnC>7r/CCLR PSNLEWl SAANHTE^ 
KHtn.VSVDH E I N t RKD I HSVDA^ra I FVRLTOr/TED I tXTITKP ICSPPKCnf EVTIJ^ 
YLIJOKVSCPKEY I NALKEXJCLELTFNLNX 1 IA0C3HD£r r FP I PKEWKKI L 
rPrF>n-FMDL/JDFX3ADFLPLLFLKRErrPLWLNLPVFLFFPVTriOTMNPLEYSU3PVPP 

-.vv^r.v • v.':':;rrrv.r:-.vr\T':rT *-:•^^^ :rF*,r'rTL?:r: 
" . ••• : '::.ur : a:. ;:fMf!' :■"-="-" ;'-".A:.rTA;* :kt'..:i'. .'..'.F.'.yw.ir': w 
tkikettkLykkew 

CPn_0101 129986 123141

/dSP family hypothetical protein

PSTLCNFSOVTTOCPSKTW PFD ITYVTTPtXE : : LIV-AWl^UJCrrW^ 
AFLTLFVLADKLHLPI IRilLMLHWNIAAIWFr IFOPEIRLALSRIRFKCIOCrFIDTOE 
OFVEOLAASIYOLSEROICALWLENXDSrDEVLSFSSVKIMATFSEELLEriFEPSSPL 
HWVIEJlGDILAYARVVLPLAHI7rTOL3RSM(7nWRAAI/;AS0RSDALIIT^^ 
SLSRDCLLTRGVKIDRFKAVLRSILSPKEHKRKPLFSWIWKR 

CPn_0102 130099 131466

cydA-Cytochrome Oxidase Subunit I

FY rorWKFMDAL I LSRIOFGLF ITFHYLFVPLSMGLSMMLVIMECLYLVTKKO lYKOWTk 
rV/GIFALTFVLCT^ArrCIMOIFSFGSrMANFSEYTGNIFCrri^SEGVFAFFI^^ 
U.rGRHICVSKKMHFFSTCMVALGAHKSAfWI ICANSWMOTPSCYEMVMHKCKLIPALTSF 
VOA^SPTTrDRFIHAVUTTWLSCVFLVISVSAYYLWKKRHHEFAKOCaaiCrriCA 
•aOLWSADVTARCVAKNOPAKLAAFEC I FKTEEYTP IWAFGYVEMEKERVICLPI PGALS 
FLVHRNrKTPVrGLDOIPRDEWIWOAVFOLYHLMIha.WC;VHVALTLISWSAYKCWRWAL 
KPFFLVILTFSVU-PEICNECGV^AAEMGROPVWVQGUJOTCDAVSPIVOANOIVOSLV^ 
FSLVTIALLTLFITVLCKKIKHCPEEENDLTEFEVK 

CPn_0103 131465 132511

cydB-Cytochrome Oxidase Subunit II

>raCIF«ELSLTSLLPLM^YVIU;VAWAYSFGDGFDL£MAVYLKAKEDKERRll^ 
PWDGNEVV^VI IVCCLFACFPACYATLLS IFYMP IV7rL\^YiniGCSLXFRSKSESVS 
WIFWDIIFICSGTAISFFICriVaJLIU^PLSPDTSYASLSWILrFRPYAALCGAWA 
SAFAIHCSCFAIillcrSDSLNARrAOQFPYILSSFLVrYVLFLGASLISIPKRFDArPTYP 
U.:tXIALTSCCCVAA3aSVSKKRYGYAFrYSTIJ^LI^L:LSAATLTFPNILLSTVDP0Y 
SYTIYNSAVirnCTLKSIil IVI. rCLPF I ITYTCYIYRVFKGKniFPS lY 

CPn_0104 133984 132676 

CT017 hypothetical protein

EKSMRMLOISMLLLALCTAINSPAIYAADSOSVSFPEQLPSSFTGEIKGNHVRMRL^ 

OCTI IREFSKGDLVAVIGESKDYY^/I SAPPCITGYVFRSFVU3NWBGEQVNVRt^ 

APVLVRLSRGTOIQPASOEPHCKWLEWLPSQCVFYVAKNFVANKGPIELYTQRBCOIOCI 

AMDLINSALWFAHIELEKSLNEIDLEAIYKKINLVOSEEFK0VPGIOCLIOKALEEIODA 

Yt-SKSLESOtTTSrASSOCSTPKVSSSEVTrSLLSRHIRKQTALKTAPLTOCRENLEYSLF 

RrWASM(XX^roHSEALTOEAFYRAEOKKXOVlACVLE^^^»HVVKNNPGDYl^^ 

FLVCTSINLEGWLGKRVTVECLPRPNNHFAFPAYYWGIKEAS 

CPn_0105 134883 134029

CT016 hypothetical protein

YVPFRKFSNONPMIil YCKXKEI HUJWPQTAKIRTTPKI AMKVK INtJOL tCI PPFISARW 
SOIA^IESOECE^^aX^GTTJaiiLIDGKI IS I PNUX3S 1 1 DIAFOEHU.YLErSOSC 
RDDOKLCVGVUiTJVWOITKGKDIQVLPKNLISPLFSCTN^ 

0VLEKKADVIRVLSG^m7LU»RPEPHCNCMHCQIGRVK^^EED^IAVSDKDLTF^^^ 
OSGDKLYIVTNPLNPSDOFSVYLGPPIGCTCGEPNCEHIKAVLYT 

CPn_0106 135073 136374

phoH-ATPase 

EKmOhOCKTMVIITrSVriYDPEALFSFENTR 1 1 1 PFPVXEEIXAFGKFRDESAKNASR^ 

LSNIRLIXENAKTKVTDGVU,PSGSELRI EVAPI^ODRRGKLLTLEl^I lAKREPMV^ 

VTKSLGRRVRAEALOIESRDYESKRFSFRSLYRCFRELOVSOEOIENFYXNCYLDLPLDV 

VSSPNEYFFMSACENHFAI^RYYVSECKIIAUCAMDKSVWGIKPLNTEQRCAUSLLLRDD 

VKL^m.IC0AG5CKTILAIJ<AA^1HKVFDKETYWCVLVSRPIVPMGRDIGFLPGLKm 

HV^PIYI>tmE^A.FSI^K>MGNSSEAWA^J^DAKKLEMEALTYIRCRSLPK^ 

LTPHEIKTIISRACKGTKIVLTCDPTOrDSLYFDEWSNGLTYLVCKFHHLAL'/GHMFKTR 

TERSELAAAAATIL 

CPn_0107 137321 136392

cross hypothetical protein_1

KKSPPPVTPKEIPTOPKPPIPORPEVSPTPTDHIVPCSIEASPILGKKPSPDSMVSPLSL 
FHKMLLENWTPVEEPFPWPPAEKNOK I FAWALNOSKLI FVSTSGNI AOPRLVTDSMSMMI 
WAANRTMSRIXACTNOVLSAAVSVDSWt^ORPLNPEROGTPLNEGECRACKtflWAIX; 
^WTCKQGKPHYLAOLLGPKA\mKHNKSOAAFDRCKNAYLNCFSLAOTLCVTFLOIP^.rSS 
G I YAPPENRKKPNSEENKVRMRW I HAVKCALVAAMOEFCNEPGNTDRRML I*/LTDLKTPA 
[TDPKKK3HL 

CPn_0108 n7FS7 137303

CT013 

KNLFHYKA I LMS I FNEEVF I ISHRHTPU;aTSTALRNTPLVNPLHRTNLOft I ASY r P I FS 
TF rGrKTUCC:33L0YSMVlxrCNFSSVCKTLPCPEIYEELPKVRKEAWLEI FCIKALYY 
LVLGVl K 1 1 KLI VRYLCPCCRP PEPREPONPLTPTPLDMGQO IDAI FSTPT3 PTGFKDPF 
LODLLQEOXKKAPHL 

CPn_0109 138646 141783

ileS-Isoleucyl-tRNA Synthetase

nOKMTAOEVCKMSFAKKEEOVLKFWKtJNO I FEKSLONROGKTLYSFYrcPPFATCLPHYG 
liLt-\GTIKDWCRYATMlX:YYVPRRFCWDCHCVPVEYEVEK3L3t.TAPGArEDFCIASFN 
EECP.KtVFRY^/HEWEYY INR tCRWVOFSSrrWKTMDASFMECVWWVFOSLYNOOLVYBCTK 
WPF:TAl/TrPL:;NFEAJOWKEVDOPSLVVRHPUJNDCASLLVWTTTPWTLPf:NMAIAV 
• IirrLr^VR 10DKK:T.ZCM I L5Qt'/n/CRWF::NPEEFV r LELTSCKDLVCRrrEPPPTFFOS 
KREErwNFHVti\A3FVEE5ECTCWHMAPAFCB^DFLVCKENHVPLVCPVDAH':.';FTEEIP 
•jYC<yy^lKl^\DKEIlKFLKKBGRrFYnCTVKtlRYrFrvmTDTrLlYKAVNCV/r/AVEKrK 

I 'Kmu<an:;;: iiiwvPEii tOL>'RFCKWL[nAnDWAi:jRMnYwrrrp rriWKJAi/;EtL\A/r;2 r 

KF.t^L'f* rro tTDI nriNF t DDLN IVKC^'K PFim [ PYVFDCWFDf-ClAMPYAOt fHYPFENOK 
KTEEAFr'ADF f A&;i.OyTRCWFVTt.TV Ti^A I l.FDRPAFRNA [ VNG t [ LAEL/;riRMSKRl^ 
NV r-:;t'K WLC/rYr y^UAt,RI.YLLIIl.-WVKAEOLHFGrjKf : I ET ;vr,KO r LLPUTtrn^FFNTY 

AKt .Y' :Fni 'Kiv^n t lpaytk r ccw i ujni.y: wikvk f_* ;m: :y yi iLNKAVRpr/TF r ddltn 

V0f(l?k''IUUthVEAFJ7rrnHIO\AF;rrLYEVLTVFi:KV(ArFVrFt^\ED£Y0KLKLEKEPE3 
VI({rr/ri\V^>(0KI[.nni.EKRMlIDlRF.lV'CU;!l::LRKEMKfXVR0PtJ^NF-rr-fr::JKDRU3 



* * KTFECLI AEELMVXNV: ^rSV r't-*r.-KFNF*»MU:KKV'^;:KMK£:VCKALJELPNN 

AToKLiOEEi><vLTiDDPE:/^i>'Scvv::p»rrDr»;Y:ARo\;<\:.F3v:LXCLREPLrvE 

GI ARELVNK rrm«PNOG--C;-DRIAUl Ii^^^£A^,^.^RAI^L 
SOFQCENWDINGKATCIEI'T/GatDS * , ^ ' 

CPn_0110 X*:-^" l^^a^*^

lepB-Signal Peptidase I

LSYPS IFWC0HYSLNK3PH :LRTr/KLLK5KKtAH5PADKKCLCELUaLEEAIFEHD0E 

Y «\- .^tr r*. — -'^^^.F•\r^v.^F^^•^CFV=■" Vr7/oT^.'tMl»PTT 

^:^•:•V:^v -rr?'-' H ■ - ii rArrKVPZLtP^K 

KK Y I KRCMUH F\JLr - V K : . --: : V ■ Awi-: I i" f :; V! ;.;i,;;^wVnV ; V : JF*J*7rrj JiJTi 

COKTI lOFKOn^OSYCRl : "?CTSMYCOFFDHKEWHODEP>«UCDPHLS PVSYADLFQC 
NYAMVrtllLTEHOARTSHL-PNPGSPTKVYLEICHTANl^YPKPUJUIYEKOI^PAIOP^ 
TU.PUlKEH[^LIRNNLr:S?F I VAOGCAYKYHOFKINTSG r AKAYAILLPKVroCCYEY 
SKCEAYO rCFCE I RYKUC33HPLTOLNOKOVI ELFNCCINFSS I YNPVNPLOAPLPKRYA 
FFlW^rt•YIraSP^^ rK^rcPTLOKFV^SETEKOEGSSC^JPY lATVDKCLPPEDFKnVE 
FIKNFCI0VPKCH\a,VU;:^'r/PMSADSR£Ft;FVPMENt^SPLCTFWPICRMC3a.^^ 
PTTLSCm.VSGIALATCL3LICYVYY0KRRJlLFPKKEEKNHKK 

CPn_Olll 144761 143934 

CT021 hypothetical protein

OLQNRYPIMPNDSSTYFEP.rLOKYLMKKCXIKTLFLFLFLSFLFSTAFSGLFASaTSSlJtT 
IOENIFIJUCTCDYTVLSRCSORTnrt.VKSTTPKTWI EI IHFPCIAHKERPSLEOASWKT 
VIHQLESPSC3VrvvSLSSECS0FFSUmiTKSLEPVTGKSTn,TArL0irDLPLSPAPANV 
I KTKCKQJKPWSPKVSFBCA?t.TS ISVNAWQCI-WPKDRGPLSETGI IJWFTOPDISVrPL 
WVS lETPKCTS IVRAVDIGHGATS PYVYSLPOSKTO 

CPn_0112 144743 145093

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit)

DSDFGVVNMKiam^PEYRQVLrVDSSTCYKFVCGSTYOSEKTEV^ 

SHPFFTGSKK^^mAEGRVDKFLKRYS^n^^OPAOOPQPEEnALPAAKCKKKVVnaaCK 

CPn_0113 145329 146405

prfA-Peptide Chain Releasing Factor (RF-1)

GFMKKICVAEYLNIUAEVEIKISNPEIFSNSKEYSALSKEHSYLLELKNAYDKILNL^^ 

ADDKOAIAIEKDPD<VVML£ECINENKVELEmJKZL£SU.VPPDPDDDLNVZMaJtACT 

GGEEAAU^VCIXrVRMYHLYASSKGWKYEVl^ASESDUarifK^^ 

GTHRVQRVPETET0GRVHTSAITIAVLPEPSEE13TEIiINEKDLKlDT^^ 

VTOSAVRITHLPTGVVVTC3DERSOHKNKDKAMRIUCARIRDA£H3KRHNEASAKRSA^ 

GSCDRSERIR^YNFSO^^lVTDHRIGLTLYNLDKVMECDI^PITTAMVSHAVHOLlMW 

CPn_0114 146371 147261

hemK-A/C Specific methylase

VMPTTSYSNMEIKKAIOEX^AYUJYYGVPLSDCEALYIl^LLEVSSIUKL^ 
MLKEYRIOUJVUlCORCPrAYLrCAVSFLGUaJlVDSR^ 

EIQTFYOICCCSGCLGtAIXKSCPWmrVl^OVCPOAVAVANENAKSNCLOVKIUJCn^ 
APYTRPADAr^O*PPYIOTIEIIHIDPEVRCYEPWKALVCGSTCL£FY0RrX0n.PIC^ 
5TGn/GWt£IGSS0GESIXNIFSta«:iYCRUlQDL5CRDRIFFLE2TORDPVSSGAYS 

CPn_0115 147279 148522

ffh-Signal Recognition Particle GTPase

HINSLSQKI^SIFSrLVSSRRINEENISESIREVRIJaxOADVNYKWXDFZSK^^ 
GEEIWKHVSPOQOFIRCLHEELVAFLSDCREErTIOICrPS I ILLCCLOCAtaCTTTWUCLX 

^CHDFVItXTACRUlI0NCLKEELTAIOKVS0ANERUVKNVA^CODV^^ 
LTGNaLStfrOGDARACAVFSIKHVLGKPrKFECCCERIOOLRSFDPQSHAERZLCaCDrX 
rnn/K£H^IS£EEDA£l^KKLNnrAAFTYEDYYKOKKAFRRMGPL^^ 
SOKEIEDSEOQMKRTEAI I LSMTPEEWCELVELDMSRMKRIASCCGLTLCHJ^ 
OSKKFFKGHSXGKMEQVRXKMSQCNCWR 

CPn_0116 148592 148972

rs16-S16 Ribosomal Protein

EKNVRRKSVAUCIRLRQQGRRNHVVYRLVIADVESPRICKY IEUj^^ 

£R I FYWLERGAQLSSKAEALVKQGAPCVYSAIXSKOEARKLVVRKKKRAYRORRSTQREC 

AAXDATK 

CPn_0117 148983 150071

trmD-tRNA (guanine N-1)-Methyltransferase

TCMKID ILSLFPGYFOGPLCTS ILCRAIKORUJ>VOLTNUIDFCU;kWXOVDOT 

MU>*AEPVTSAIRSVRKEN3ICVIYLSP<X3AIXTAEKSREUVAASKLIUCCHYECIDERA 

I ESEVDEEISICOV^rt.^TJCG lAALVLIDAVSRF IPCVLGNOESAERDSLOJGLLBGPOYT 

RPREFEGKEVPEVLLOGDHXAI SQWFLEOSERRTYERRPOLYLNYLYKRS 1 OHKFDEETT 

TNRDHFKCDKISVVLEVNKUCRAKNFYCKVFCLDAMSCENKFCLPHEXJKTIFWLREW 

KKNIVTI^LSLiX:ACEEDFCYUJWW£LFGCKLL£K0AOEKAVWALA00U3CHAWZFSWH 

RMK 

CPn_0118 150075 150464

rl19-L19 Ribosomal Protein

KKENFRWY IHVNLUCELECEOCRhTOLPEFHVCiyr I RLATKI SBOCKERVOVFOGTVMARR 
OCCGCETVSLHRVAYGEGMEKSriXNSPR TVS I EIVKRCKVARARLYYLRCKTCKAAKVK 
EFVCPRSSKK 

CPn_0119 150520 151114

rnhB-Ribonuclease HII

LMNTS r SEI0RFL3M I AFEKELVSEDFSWAC t DEACRCPLACPWASACI LPKClCVrPC 
VNDSKKLSPKORAOVRDAUlOOPEVCFnrGVISVERIDOVNIt.EATKEAMLOAlSSLPrS 
PDtLLVOGLYCPHOI PCKK I IQGOAKSAS IAAASILAKEHRO0LKU}LHRLYPEYGFORH 
KCY^T.':UIVEAIRRYCP^*r\■HRKSF.';PIKOMCAIV 

CPn_0120 151125 151778

gmk-GMP Kinase

ErXFr:MKAN*yr;YCMNK r LVrrtPFrrpDM'JKCCI -K LFT r 3APAf;vr;KTTLVRMLB0EFSSAF 
AL-P l^V^T^<K PRECEVt\;Kl^YHfV::HEEFC'R( .r.OROALLEWVFLn tECYCTrjMLEIER IW 

r;t/:KiiAVAV ro [cx -\LF iKiJRMPrTV:: i f r Arrr;oEELERRuvnRC::EBf;.ooRKERLEHSL 

I KI JUKAWKOW I rWDUl J^WYRVIJC:: t F I AKU IR« f [. 
f.TOit (ivty.rit«;r it.-.it ;Mf.r.;Mi 

mtHl KK[)l<rfNKKLNKLr.*;:r'K;:LVHVA I KyAK rK lAKf ;rA/l*::: WA( LTLVLLDREni 

Oi'Km:K iy';TA::i'rvKKKK::i-ami::i'KKrii': :aytv;:;dvk 
n\r,r's •Mi;rtuiii/l -'•-PNA :;vnr,n»;c.if:'»

'.'KVMPUKVL :TJ.\LPYANCPU<Fr;H IAC7^rLPADVrAPFRRLU:DDVLY IC^SDEFOIAI 

rENHrSECLY-EOEORFLADRYVEGTCPRCCFDHARCDECOSCCADYEArOLICPKSKIS 
':VELVKKCTEIinVFLLDRMKDAU^Ft0CCaP0HVRKFVVDYtEHVRSRAITRDLSW3I 
PVPDFPCKVr^/WFDA P rCY I3CTMEWAA::0CNPDEWKRFVLECX:VEYV0FTaKI»n.PFH 
/.^/^/FPAMEtr^KLDYKK-/DALWr:EraLECRCF3KSECrrnn><DKFl^3YSU)KUlYVL 

- : ■• :.- ; " :r:"::;y v .\".\ekmiv-":. '".".'r.EOrrP 

. vv.;*..' ',- :.. ..*/,"■ ••;:*'!. "y/-: I'.v.'FrJCOAf'Wi' "r:-.i;?.vEA[ 

LFCAcVcOKLLALIJYPt'tPEJAVAIWEMIJPKSLENCNL^ 
FHUCSPRLUTTVE 

CPn_0123 155')75 153774

recD-Exodeoxyribonuclease V (Alpha Subunit)

NSMEKicCTLEO ILVE^^CDSGD:TAY I Ki p^ncTTPILIKCKLPOPLEI/:sp rorvcvwsH 

3PSNTKYFQIHSYDSPU*YEYRCVFHYLTSKLIKGICPKIAEKIIEKTOaC^^ 
ERLSEVSCISETRCVSICKOLCEOKILRKTLLFLOEWIPIHYCVRIFKKYOEKSIEKIC 
EDPFIXAREMBC ICFKTADF lAMKLtr/PRNSESRLCAGIOHSLEELOEECHTCVPISLLI 
OWAKlXN0DVFErrP:Tt^Er3T0rL»1OKRKLIi{IODISCTU<WTRY 
LKRILFSSRRI RSIDCEKA I AWVEENl^IDLJ^ECWREAI KACFSEKIi: ITCGPGTCKST 
tTQAIUCIFEOVTHKI It^VAimSKAAKRKrEITOKHSVTIHAU^DFKTKSFRKNHDNP 
IDCOLIIVDESC>MDTHUJ<HFUCALPDYTTLVFIGDIHOLPSVC;PGNILKDLITShIKKr 
y I Rimi FROVHDSGI VTNAHRVNEGELP tLYSETCRRDFLFFOKDDOEEALNHI IHLVT 
KFVPOKYHIYPODIOVU^PhOCKCTTLGIYNU^KAUCHAI-NPKKANUiC 
IRNNYNKEVFNGDICY^/STINFEDKAVWTUlECKHVCYSFSEI^DLVtAyATSVHKy 
ESPCI I IPIHTSHF^O^LYRNU.YTAITRGKKLVILVCTKKAIAIATR^I^mVOHRCTGLA£ 
VUCELCrKKNYADL 

CPn_0124 156575 159068

No robust homolog present in Genebank/EMBL as of 11/7/98

rRSKORTVAITLCVt^IIXIASG:iFLAVAIPG[^SAVAIjCLGCGhn-Arxm/U.ITCLVL 

LIRSEKLJU-£0V£IKOARTRV^niErJ^Qt^QYVFr^ENVU3NL^^ 

Tmj:ODIEE IFLTlJUJIRNALDNEEFFWrHAKOCLAOVGESLFODAS IDEFINLAH 

RQHU)I^roPRWSMITKKVX^rvVRFIYVSTMYKQIKSN^EKSDFOOIJUC^^ 

VLYQSFOKGYNRAALLSEKTR I IHTSSLUiWEJa3ED!a{I^IKKECASHLENFKKFRTLFL 

GLSEEDVIDFTGASGWDCSKLPRKEVPOXriKKKIJlFKRTFADECfVC^ 

* QEEDPLDRLMDOVEOEATSVLKDODRYWKEIETSEAKFRSLPREDDFEKOSOIDSYIRDL 
DDHLSVWANOLSAAEDALIEVTDVQEHChmEMUCNIQQGLELIEDAVKATLPRVDFIOEL 
LEKFFTPLVAARMSLENS 

CPn_0125 153072 158605

No robust homolog present in Genebank/EMBL as of 11/7/98
KISSCAEIMSEVKPLFLJ<^roSFDIATORFONLrNMWEQA£IYNEYEEKNARVONEIIC£Q 
KDFVKRCIEDFEARCLGVLKEELASLTRDFHDKAKAETSHLIECPCICFYYSIHOEEORO 
RQERWKMAERYRlXKaVLEAVQVEOKDMISSRVVVDDSYFEEEKEEQKVDN^^ 

CPn_0126 158806 161085

No robust homolog present in Genebank/EMBL as of 11/7/98

■ llvfsyycmglfffscaisscgllvslcvglclsvlcvloijj^ 

apdlloledaserljlvkasrslj^lpkeisolestirsaandt^iktwphkdorlvetv 

srklerijuu«3nymisei/:eise:leeeehhliij«:eslewigkslfstfij2«esf 

hi^evrpyiavndprlleiteeswewshfinvtsafkkaoilfknnehsrmkkklesvo 

elletfiykslkrsyrelcclsekmri ihdnplfpwvqdqqkyahaknefgeiarcleef 

ektffwldeecaisymk^roflnesronkksrvdrdyistkkialkdrartyakvl^^ 

pttbgkidwdaoraferosoefytlehtetkvrleaiwfsdijleatnvravrfto 

nandlkesfekidkervryqkeorlywetidrneoelreeigeslrlonrwcgvtiagyda 

GRUQ3LUlQWXKNLRD\rtJatLEIUTMDFEHEVSKSELCSVRAR^^ 

I EEU^EERCILP IRENLERAYLOYNKCSEILSKAKFFFPEDEOLCVSEANLREVCAOL 

KQVOGKCOERAOKFAIFEKKrOEOXSLIKEOVRSFDLAGVGFLKSEtXSIACNLYIKAW 

KES IPVDVPCMOLYYSYYEDNEAVVRNRLUWTERYONFKRSLNS IOn«D\^^ 

PEGHETRUCERELOETTLSCKKLKVAODRLSELESRLSRR 

CPn_0127 162152 161130 

ycfF-Cationic Amino Acid Transporter

ESFMFPSANOESRTRNVPLG IFHCLVACLYWG rVFV r PNFLCSFGDLDIVLTRYTIFG I F 
SLIACAI KNPSVIKKTPLY rWRKSLLWTLLINPVYYFG ITLC IRYVGSAITVVI ASLAPT 
AVLYHSWTKOKELPYSLLFAISSVI ITCVILTHLSALNLPTAASPLYS ILCVIAVI tSTS 
LWVIYVrRNQSlXEKHPNLTPDTMS'/LICISALI ICLPMI I ILDLCG ITHVTHNLISHTP 
GS ERIXFLLLCSAMGI FSSAKALX AWNKASLNLSPALLGAILI FEP IFGLVLTYLYSOSL 
PSLOEGICIFLMLCGSLUTLVLFGRKVOKSLENSQVSSSNE 

CPn_0128 162262 163053 

bpl1-Biotin Protein Ligase

EDRGRMLRNOVLVYCSECVS PYYLRHT IRFLKyYSTQBGAFDILRVDCNFt.r KNPFWEET 
TRLUVFPGGADRPYHRVLHCUrrAR : FOYVSECGNFLC ICACAYFGSKM I YFYEPECAPL 
OGARDLCFF PGTAKGFAYRCNFSYVS PSCVRVS POLFSDFGLGYAMFNCGCFFECSBCYP 

r^/^/I esrydolpckpas i vsr r vskglavlsgph i eylphycrmvkenvqktreflorer 

TTLDRYCONLVORLROPAFSKADC 

CPn_0129 163747 163064

similarity to CT036

DEOY tUSH tHMDPR r FVTSEPLQKTYOKLOEKHVTJNLC lASOVaCTDLONKTOYENNLI E 
rrTNEITYYFPWHNPD I LRSEWDPI SNOLYL I FKKFF I HYHNLFSTALERNO I LLIDS L 
NTf J3SNP I AROMEU-AFLCVFEOLDYNEDErr I EPRDYFNRFVYKNSOTAPO lOSFCLLH 
EEMSYASNN IRNV-XTHS IVLCSPlLYOLITErDTTK I HA0OFDCL I 

CPn_0130 l(>^2Si 161751

No robust homolog present in Genebank/EMBL as of 11/7/98
.::;rWKC:::: t rilENKKPACIXPESKFAA [TKIJSLAILGLFU; [AACl LrALSOLLrNTLLI 

lALT.Lt:: I tvr^:'n:t:^u.((Trcx::;Kr:vyKDEOKrK.:rFPKETpr:LDPWLLNPt-KNK iQii^ 

hTLLLDPT:; INLKNF.r.I* FP::rEEWKK I FLKDPDFL [ K:;ALANWK t LE 

• Tiijilil Ii.^l-ll l(.'/,Jifl 

No robust homolog present in Genebank/EMBL as of 11/7/98

;-;::XKKKRF::u:u\i ri.i t- FFT::AWFr:: [(:Fr.KLFM(;NAM::::::n'Yw:ivwi i.kt:;vao 
t•r/^rK^^:K(:I0VLI-r^::\mI.F^-L^:vf^\^•rFr•f:vLI7FVI;^^AU*1lAl::I.vI.Ft.LIRSV 
u::::Mvrji<i.wt *::kk. ;vAlJtuil^>I^ XKi.uvKk'/'A'i ;.U'::i'YiKVtiAi-wi*::. :i/M'EDP:;oaa 
v( .1 .( j:t wniMivi vhi/M .i.c: :i v»:kk' :ky i dpvi.hk i Kr<v:;i.(.vf-i . lAri-r jjOLNtX) 
. :vHi 't wiiNKKrr.i I- ( NKtOMi w Ki i^;rjr.Ki IE fM:::;!.*- r(v:vr'UOi-::M::ix)v:;oAMF:JVYRY 
( .i<'..iM/i ;(-f: :i :i j« ri » u i-Kt :riwni -i -virENPKnt ad: :dfli-.a. KrwKwiKF t jai;i:k 



ALLKNPQCt^rKDLKOFLV:- . - 

CPn 01 l^S.5P4- .;Ltkti5(il ' " "Z 

No robust homolog present in Genebank/EMBL as of 11/7/98
SM r EFAn^KTSVTACR : EDRMACRMNKI^LA ET3tr^/LI SSVC IK:G :LC rSCTVCTY 
AFWCr IFSVLALVAC/FFLYFFYFSSEEFKCASSOEFRFLP I PAWSALRSYEYISODA 
I NDV I KamOL3TL33-uLOPEAFFLEFPYFTIS LI VNHSMKEAORLSREAFLILLCEITWK 
tXTETKILFWLKDPNtTPOCFWKLLKDHFDUCDFKKRIATWtRKAYPEIRLPKKHCLOKS: 

..j^.. . . "RFp vtv7;;:r.r:r/p»«T r::,?ir,TyTr.Tvr?>r 

MKrr^i ' 

CPn_0133 167349 166564 

CHLPS hypothetical protein 

NSSAYMFKLLKNLFLICCCIVGYTVMRKES rVEOWLSNRUflW^VCRVSIRTSCIKIRH 
rC IHNPLASERFPYAAE r EYADWSS ISMLLTKOLEISELI IHGANFTlFPYTJSHGTICr 
NWSLVWKNFHPOKCTPSNLWI DRAPVLl RRCLFLNTRLYCLRANHKDI PHLSVPSLEFHS 
HTSSAKELPKLSEALPSLL'njVLEESLYHLNLPGO I IKPLSCOAHKHFYSSYPOrODRLN 
DrNTPGTPTEEI ICFIFCLFFH 

CPn_0134 169131 167467

groEL-HSP-60 

FADYRKLRKTTMAAKNIKYNEEARKKIHKGVKTIAEAVXVTLCPKCRH^ 
TKO^rrVAKEI EL£D^CHEN^CAO^f/KEVASK^ADKACDCTTTATVIJk^ 
GANPHDUCRGIDKAVKVWDELJCKISKP^A:HHKE^AOVATISANNDSEICNLIAEAMEKV 
GK^CSIT^rtEAKGFETVLDVVEG»JFNRCYLSSYFSTNPETOECVLEDALILI^^ 
IKDrUW}OVAESCRPr^I lAEHIECEAUVTLVVNRUlACFRVCAVXAPCFCD^^ 
EDIAI LTCXXJLVSEELCMKLEinTLAMLCKAKKVIVTKEITrT I\TOUa^ 
KKQIEDSTSDYDKEKLOERLAKLSGGVAVIRVGAATEI EKKEKKDRVODAOHATIAAVEE 
GIU«X7rALVRCrrri£AFLPMLANEDEArCTRriUCALTAPUC0IXSNACKEl» 
QVLARSANECYDALRDAYTCMI DAGI LOPTKVTRSALESAAS lACLLLTTEALIADIPEE 
KSSSAPAMPSAGMDY 

CPn_0135 169448 169143

groES-10KDa Chaperonin

MSDQATTUirKPLGDRILVKREEEEATARCGriLPCTAKXKODRAEVLVLSrCiar^^ 
LLPFEVOVGDtlLMDKYACOEITIDDEEYVILOSSEIMAVLK 

CPn_0136 171119 169569

pepF-Oligopeptidase

SPSHY0IEKPESLLELI^KKFSVERKI*D0LYIYAKLrHiXJDITNPECESDrOSIVn.rrL 

FSOEISWIOPALIAl^EEKVAAU^SSVIAPYRfYLEKirRLSPmxn*ANEElCIIASSFA 

AUfV5NKAF5SL5DA£IPFGIAKZ>S^X;£EHPLSHAIJV5LYM0SPD0EUU^'AYW 

YOYRNTFANUiJCKVOAKLFEAKARNrPSCIXASLFOHNIPTTWINLINrrKX^ 

RYFNUOCEAmJCEFHFYDVYAPISOTTSKNYSYEIXJVDLVCKSLLPlJCT^^ 

LSN»AmRYEMCHiaiSGAYSSGCYDSAPYILI^nmm.VDVSVIAHEA^ 

QPYHDAOYPLFLAEIASTFNIMLLKEALSKSDOSKEDKI VI inCTLDTI FATLTROTrrA 

AFEVTlHSAAEOCTPLTEEn^ATYGmXJKEFYCCVVTSDSWALEHARIPHFTnfNFYVY 

OYATCI IAAI^FA£KILTOEPGALELYTJCFUCSGRSDFPLNILKKSCU»frrSAPLDK*r 

AFinCKIDLLSSLLSED 

CPn_0137 172263 171502

ybgI-ACR family

VCSMNVADLLSKLEn^SKIFODYCPNCLQVGDPOTPVKKIAVAVTADLEriWV^ 

ANVLIVHHCIFWKGHPYPrTCWIHWlIQrXIEHNIQLIAYHLPUlAHPriXaO^ 

NWHDLKPFCSSLPYtXTWSFSPrDrDSFrDLLSOYYQAPLKCSALOCPSRVSSAALlSC 

CAYREl^SAATSQVrcFITCNFDEPAWSTALESNINFLAFGHTATEKVCPlCSlAEl^ 

FPISTTFIDTANPF 

CPn_0138 174094 172700

hemL-Glutamate-1-semialdehyde-2,1-aminomutase
TWSIUXLAim}IXa*mWKLTKRNSHUCSWK>nVTFEEACOW 
VTPPIVSSAOCDIFUmKJREFrDFCCCVCALIHCHSHPKIVKAIQKTAlJCCTSYCLTSE 
EEItXATKLLSSLXIiCEHKrRrVSSCTEATKrAVRUUlGrTtmSIIIKriOGYHCKABT^ 
LCGISTTEET IDNLTSLIHTPSPHSLI.ISt.PYNNSO r LHHVMEALGPOVACI IFEPICAN 
^C IVLPKAEFU5DII ELCKRFCSt^ IMDEVVTGFRVAFC>3A0DI FMl^POITIYCKIlJOC 
GLPAAALVGHRS ILDHLMPECT I FOAGTMSGNFLAMATGHAA lOLCOSBCFYDHLSQLEA 
LrfSPIEEEIRSOGFPVSLWOGTMFSLFFTESAPTNFDEAKNSDVEKFQTTirSEVFONC 
VYLS PSPLEANFI SSAHTEENLTYAON 1 1 r DSLI KI FDSSAORFF 

CPn_0139 174686 174093

yqgE 

SPTKNKLRDIMKI PYARLEKCSLLVASPDIhW^ARSVI LLCEHSLNCSFCLILNKTLC 
FEISDDIFTFEKVSNHN I RFCMGGPLOANOMMLLHSCSEI PEOTLE rCPSVYLOGDLPFL 
OErASSESCPEINLCFGYSCWAGOLEKEFLSNCWFLAPCNKDYVFYSEPEDLWALVLKD 
LCGKYAGLSTVPONLLLN 

CPn_0140 175140 174673

yqdE 

PRSNOOKrFCMSLEKELLEETPL'/LLNrrKLVSFCN-/ACMXUrrEEKKFArYGHVSMOOA 
FOGAOTEOHSPORPFAHDLLNFVFSGFOrOVLRWINDYKDNVFYTRLFLBOKOTEFLYV 
VDVOAR PSDS I PLALTHK I P ILXIVKSVFDAWPYEE 

CPn_0141 175817 175110

rpiA-Ribose-5-P Isomerase A

KSSSAVEKOLHLHEKKCLAHEAATOVTCCMrLCLCGGSTAKEFIFALAUR COTES LAVHA 
t A:JS0f»3-/ALAK0LA I PUJJPEKFSSLOLTVDCADEVDPOLRM IKCGOGA I FREKILLRA 
AKRS I rL'/DESKLVrVLGKFRVPL£I3RFCRSAI lEEIRHtXSYECEWRLOOTCOLrrroS 
:;NY I Yt, I F3 PN.IY PNPEKDLLKt. 10 1 H^T/ 1 EVGr/T EKVEVWSSNSQCL IilKKYSV . 

rptijn^'/. i.vui 175^1*; 

No robust homolog present in Genebank/EMBL as of 11/7/98
r;H.'r^:;K::K'TrLEKFHFKILOLLCTKNr: IUiFC:;HFE [rTKV.^HDNAtOK CH:;N'Pr>KPrAEN 
faNTU':FKDLKIDYrKOS;:KRFPFLY.':t';Prr'UIKUWKYF'/T 

'rpfijju: i::;47 \/f.:>\A 

"y»c I tiy(x>r.hot i'; -i I f-rM *rin 

i'ftrtP€fr::LKnPLK::iiFC^\rnFLRi'EiiLKKTPE:;LKD';:;L':Lix;Lwu:utAioDL[KK 
VKAA/ ;l:;k rT0i:EKitnAn>fitvc/Ft4w ;Ft k;*/' ihhI'ATEt.vffcx ;eram tnomxTOK i;:vr, 

IIHI+VWIFKFVKALEDKFTTAK'^Tr.PAI'AOt'LKOM I FMIH rEVTRKFYfTNOELrEOrVA 
» r/HKV f I'W.YDAtX'UYUM.IM^X.THO U:jifi'U*/f::M-fQ I UFy.t'Aom, lyovLL INNLV tAD 
XEKTV^'U;L^T3JCTPTLDIKDEV t AR I flOAADY LPLERL:: w P0C0rA3CE ICNKt«TEE
EQWAKVAI.VKE ICEEVWK 

r/EcVN^^EKFr-DAVSEALE^ 

riP'TLI^rrAVKDALr;RrPTV-/Er.E\mPKP5Pnt^LLRpAK0EA^ 



K0U'VL»1CAI.IACAKYRCEF£ERLK3VLKDVE:X;WEHIiriDEVKT. 
AANUJCPAIJ^CTTLHCICATTtiJEYOKYIEKOAALERRFOPIFVTEP^ 
EKYEirHCVRITECAI^VLLSYRYIPDRia.PDKAIDLIDEAASLIR^ 
KERElJ^IVKOEAIKREOSPSVQEEAnWCKSIDALREEtASLW^B^ 
KKNSIXSMKFSEEEAERVADYNRVAEUlYSLIPOLEEEIKODEASUJORa^^ 
RLIAOWAN;m:iPVOKHLECEAOT^Il^ESLEERWCWPFAVS^ 
PORPLCVn.FLCPTGWGKrElJVKAlJaJLLFr«EEA»C^ 

•/CYEBCWSLSEALRRRPYSV^A,FD£rEKADKEVLNIUJ0\ff^JCCILTOCKKRKW^^ 
^I^f^SNICSPEUU>YCSKKCSELTKEAIl^SP^rt.KRYLSPEFM^mIDEILPFV^^^ 
Dr\ACIV3I0KRRIA0RLKARRirre.SWDDSVrLF:^E0GYDSAfGARPUCRLl0QKWTl^ 
3KALLKGDIKPCTSIELTMAKEVI.VFKKVCTPS 

CPn_0145 180717 182369

CT114 hypothetical protein
NCAASFIWLNKS3HR^nJlSPHFKSFIVRYMFVGGLVSFU,PIP0LECAIiNVT^^ 
VISROUCLOEDCOKFWNU3PYKLESLCAYOVLYHDDYSSKRIRELFPOI0KDEVM 
ILTLCKVDRGFSPEEISLIOKI^PGt^LASLRGSTEIDPNTDIJUtALVN^ 

RADYYSNCUJILAUlIHAERORYUWSPCVPCTSErHKATIEArOTILrYEEW^YP^^^ 
£MFSDEFSrI^SVTDRKFGVCLC7/SSLYFSLS0RI^LPLEAVTPPGHiriJlYCX3CpNIE 

TTAOGRHLPTASyCDCIXLEDU3VRTPEEMIGLTFMFKX3SFArX3KKKYKEAEEAYKK^ 

YLGDE£XO£U/:FV0ttJXKKKEGKSLIGKSPRAS0KGSVAYDYIJCCRlNrPTIJaXFS^ 

PGSNYEEIASYEEEIJCKAMKSSMPCCEDOmASVAF^LCKTAEAVAIiEKCVEDIPb^ 

SLHUax;KIU:DRHEYTKMiCyFIIAERLMEDOGrUaQNRSFAIJTfE^ 

ANTLLLMESER 

CPn_0146 182595 183095

No robust homolog present in Genebank/EMBL as of 11/7/98

I IVCISMSSSEV\rrOTVHGLGFCGLSSKSVWFKKSLSDAPRWCSILVLTLCLGALVCC 

lAITOCVPGVILMOGICArVUIMSLALSU^WLtCLFSl^ 

CCFSRAAPSGMCLPCDGSPRASTPSCLEELOAEIOAVTOAIDOMSDO 

CPn_0147 183213 183671

No robust homolog present in Genebank/EMBL as of 11/7/98
HGGPMAVOSIKEAVTSAATSVCCVNCSREM PAFNTEERATSIARSVIAAI lAWAIStX 

gujlvvij^gccplgkaacaitmllcvaujwailitlrij:^ 

SATPPLBGGVAGEACRGCGSPLTQLDLNSGACS 

CPn_0148 183822 185702

pkn1-S/T Protein Kinase

CICWWRVSSMESEKDIGAKFLmYRILYRKCQStWSEDLlJ^RFIKXRVT.IRU-LPDLGS 

SOP^^EA^HDVVVKIJUCI-^mPGILSI EKVSESBGRCFLVTQEODI P ILSLTCmJCS 

LTEl£IVDIVS0LASLLDYVHSEGIJ^EEMJLDSWIHILNCVPKVIU'DLCFA5LIIC^ 

lUXSFISDEErmESKIKERVUJn'SEGKQGREmYAFCAITYyUJ^FLTO 

FSDFIYI>roFLISSCI^CFKEERAKELFPLIRKKrLCEELON\ArrNCIESSUlEVro 

SSONLPOAVUCVCETKVSHOOKESAEHLEFVLVEACS I DEAMDTAI ESESSSCVEEEJGYS 

UUjOSLLVREPWSRYVEAEKEEPKPOPILTEKVLIEXXJEFSRGSVBGOROELPVHKVII. 

HSFFUJVHPVTNEQFIRYLECCCSEQDKYYNELIRLRDSRIQRRSGRLVIEPG YAKHP VA/' 

GVTWYGASGYAEWIGKRLPTEAEWEIAASGGVAAlJlYPCCEilEKSRANFrrArTrTTVMS 

YPPNPYGLY»lAGNWEW2OCWYGyDFYEISA0EPESPOGPAQGVyRVIJlGGCW^ 

LRCAHRHRNNPGAVNSTYGFRCAKNIN 

CPn_0149 185706 187700

dnlJ-DNA Ligase

ERFMKEENSOAHYLAl^RELEDHDYSYYVUiRPRISDYEYI»?KIJlKIXEIERSHPEVnCV^ 
WS PSTRLCORPSGTFSWSHKEPMLS lANSYSKEELSEFFSRVEKSLCTSPRYTVELKI D 
GIAVAIRYEDRVLVQALSRGNCKOGEDITSNIRTIRSLPLRLPEDAPEFIEVRGEVFFSY 
STFOIINEKWLEKTIFANPRNAACGTLKLLSPOEVAKRKLEISIYTJLIAPGDWDSHYE 
NLORCLEWGFPVSGXPRLCSTPEEVI SVUCri ETERASLEME IDCAVI KVDSIASQRVLG 
ATGIOfYRWAIAYKYAPEEAETLLEDILVOVCRTCVLTPVAKLTPVLLSGSLVSRASLYNE 
□EIHRKDIRIGDTVCVAKGGEVIPKVVRVCREKRPEGSEVWNMPEFCPVCHSH\AmEEDR 
VSVRCVNPECVAGAI EKIRFFVCRGALNI DHLCVKV ITKLFELGLVHTCADLFOLTTEDL 
MOIPGIRERSARNILESIEOAKHVDLORFLVALCIPLIGICVATVLACHFCTLDRVISAT 
FEEU^LECIGEKVAHAIAEYFSDSTHLNEIKKMODIjGVCISPYHKSGSTCFGKAFVITC 
TLEX24SRLDAETAIRNCCCKVGSSVSK(7rDYV\M3NNPGSKLEKARKl£VSILD0EArrN 
LIHLE 

CPn_0150 187759 192444

CT147 hypothetical protein 

: I'nf KFFYSYNCPYFI SFr\/UjCVNKASSSNNSTKODG r PSWWPNVOWNRASCJVGDOEA 
NSLTPEAOTSRGWFSDRKHFLEVLCVSLEEMENNDLKK'^SRYKTIILIATLVTVArTCIV 
P ISMVFC I PMWVPCLI LFCACLSSAFLSHRUJSKCKEIH LRYRAYO lYROOLLSQYPDLR 
KSTLYKnfSITHVKPKKGFVGKLVENLRPDLHKNKDDGGAAADSRLOFAGYCVKHYQTDAL 
LCVSCVNSVeWORLASLIMSVKNDILNDVCSREPrDKAORSALWSGKDIOCEIOPOCIL 
DISRDtLAICCYCMNVGVEAKKArDOYKKWYLNSSTFIAWNPOLPAIAOSYLCEQORHLD 
VAAKIFOOLSALTTAHGTGOALEDLDSLLCYYDOLIESKCVGEKI IAS IHOKHLDCAMOD 
:X:C10EKLKKWSNLYHVFSITlKEFTECKLE0NEWSRI0RLRGKLEKSKCSrLGNCRTNA 
E*6-,TK5EKKLADYLL0 ICOREPFLTCMHKA CATGKAIQGKVECVISOHPEKOIKMLRCS I 
Er'LEGMLRREOWGArUJKNEOEVlJVLKGTMEAOt^FKDLVCTjrtnKYOEFKKfn^ 
r/DFTK:;Yr»NLLNRLEVUiAECrrDDLVLHVDRMSEDLKKTI EE tDCNLFOVTPEELSLL 
APerrO^II WJELPL IVOECNRLOEAirrnECVSOCLMLUCLLNROEK INKNt E3SRKNLVA 

t AKOAH; riAiitt r c:;(.v;lapliornpj^cloh Euw/LFMo:; z rn ihaldttetlvatssnm 

t*;:AMIITFrWN I Y*mt.L0VLEIOSKPAPAPMENPDLP';ALPEEVODAVAEDVr:GTHRLHHO 
VI r RRi-AtJI .KNM I : X) WK J t NtCWTMAKA I VLG [ VAVLF CVU :A £ F [CON t IT.LLILSCVC 
tl.LTOV^•l•l.rFrMU:;K:;KEFEKOVl^,\O^Lr^ATKar•CEFNNKDLNRLAKWDNl^^ 
.:t--':r'rWAKNlV::DI.aUPrK[-K::LKDLTKEFRKO::KMUIKRIKnRFKECU;OEAPVVRPT 
I l-jOIKt :AI-;VFAKUIHL:LU:HL0K0KEEX J rRGDAL70£PMCUXEK:;KYDMF.KAi*AAAMT 
K K'y»':KI A,iN I OK I A.1KNNLTYVR I ONFFRTL lOEKLGP.DT/OE I rJWK EAK EUtELAA 1 1 YG 

wr::f :k:vkoio\kkofkenvui I A(;Kr^r.ELu:/\vu'r/-rAa> 

I * ;akui :K^\Kfrri -v;riKh>iLKTu::u;YLTrFvnFM::PF„rr»j:'-nYNij i lkvreolfoeeorl 

tjt K/ITV: :i KI 'VAAVyAyNLAAWHKl 1 E^L t VrrTYl^t/iAOt? Vr::::KVTTLMftDUlAVEELV 

^■>1'^rr■rt•vf^t wrt;:in,- ti.mw\\:7/uirMLnDc:Dr.'ji'^ri i idwkklfei.lnnm'J^/npndpec 



r!vv:-:i-: 



OlfVHOrLLDAPVSL^ YOAF K iEFLUMr-EIJ; [.Vi3TK/\AEEEAKRYVEEKCRCFrrY 
SeSSoRLEAI AAELOOLRNOrrLLSrE I PLANLK :.-:rSDL.NU*EKVSVEKAAU:E£:0 
CIOTOYAEJIOCISBt^OKFEDWKiaXA^ r^SSVroKOK^JJ£;fr:^E 

AA 

CPn_0151 1?4179 12^25

mhoA-Monooxygenase

CYENl^HYPRASKADILVICANrTGLIL/VWLrOHCr^VKVIDHRASPEOPSr-^ 

„. . ^»;i(L' " ">i'A'vy'>Tr.r.rv"'*~^ '^''^r^r.^'rr'fc 

.'.'ur.jni.t'v.y'yp''" ' w "*■" ***:'! 'Tr ;. ' " \y*v v* .t* t ' : r '■'w '. 1 a. .' zatn 

NU)iRDLVK_vUi%AHn I NnLV i r' 1 :JC- r £=Xiii IHL^i* ITK.* Jr L.\FVn*NrCEKTK;.' 

LCU«nT{SI3PKLKCKLL*»TYNLVISDENFHIKT3HHAFPPEKGIA^U:SLSmXiS 

YUJCxJJmiHAAFNU^WKLLPVUCKAAUCHLVtTKEOEDCailLPYISPTTE^ 

RFYTPALKrfrLKCCRKFNTTCEEYYYPPHOALKYRSSDIIKMSPODKEIHCPCPGKRAI 

DARLEICSFL1DPLKSSKHU.I FFKDIPDLKEAIOEEYCEWI EICNVKEPRILNLYKANP 

NSLFIIRPDRYrCYRTHTFKLHELISYLLRIFASEKTS 

CPn_0152 195274 194318

CT149 hypothetical protein

LIKKRKVAFLVSCIJSVAIGASAAPVRVPCFPOIPECLVOIKTEVCPKCEVCIAVTrKCO 
DHNLICTU^LPI«'PTPEXXFPT^A/LFHCFRGTKFGGLTGAYRKlCRKFAAVCIATU^ 
ACCGDSBCVAEOTIETYLROAOTILETl^EHPDt^YRLG : SGFSLCCHIAFELAKm 
PRDLNIKALSVWAPIAKXIIUJCELYENrSKHGECD I ISVXJKDPGFCPPPI IVCSCWOL 
LXRIODHVTANSLPTKPYILHQOGIOCTLVSRTQCTLFKNTAPGR^^•FrSYP^mW^ILAT 
APDLDMILDQIVSHFORTL 

CPn_0153 195430 197892

leuS-Leucyl tRNA Synthetase

rWYDPNLIEKKWQOFVKEHRSFOANEDEDKVKYYVUWFPYPSCACUIVCHLIOT 
rVARYiaUUWFSVUiPHGWDSrCU'AEOYAIRTOTPKVTTOKNIANriaQ 
EXaiEFATSDPOYYHWTOKLFtXLYOOCLAYMAmAVNYCPELGTVLSNEEVENGFS IBCG 
YPVnU<MLROWILKITAYADKIl^LI)AUWENVKOWKKWlGKSKUmr^^ 
LEAFTTRLDTLLCVSFLVIAPEHPDLDS IVSEEORDEVTAYVOESLRXSERDRISSVKTK 
TCVFTCNYAKHPITGNLLPVWI StJYVVUTYGTGVVMGVPAHDERDREFADirSLPIHEVI 
Dn«WCIHS^lyNDFClJCL^COEAKDYVINYLE>IRSrJCRAKT^fYRI^^ 
IPIIHTErCTHRPLEDDELPIXPPNIDDfVRPECPCSOGPLAKAODWVHiyOEICrcRPGCRE 
TYTMPCWSCWYYLRfCDAHNSOLPWSKEKESYVMPVDLY rOGAEHAVUai.YSRfWm 
\nrYWGLVSTPEPFKKLINOGLVLASSYR I PCKCYVS I EOVREeCTWISTCCEIVEVRO 
EI«SKS!aJ*K?roP0\rt.IEEYGW3AUWAMFSGPLDKNKTWSNBCV^ 
TSSEVODIEDRDCLVIJ^HKLVFRITEHI EKMStWri PSSFMEFt/TOFSiaPVYS 
AVRSrt^IAPHISEEIWILCNPPGITOAAWPOIDESYLVAOTVTFVAAJVrCK^^ 
AKEAPXEEVLSLSRSVVAKYLENAOIRKEIYVPNKLVNFVL 

CPn_0154 197874 199202

gseA-KDO Transferase

TSEFCPMMLRGVHRI FKCTYDVVLVCAFVI ALPKIiYKMLVYGKYiaCSLAVRPCLKKPHV 
PCBSPLVWFKC»SVCEVRIXLPVl£KrCEErPCWRCLVTSCTEUW^ASQVri PHGATV 
SIIJLDFSI IIKSWAKIJlPSLVVFSBCtXVLNFI EEAKRIGATTLVINGRISIDSSKSF 
KFUCRLCKNYFSPVDCFLLODEVOKORFLSLG I PEHKLQVTCNIKTYVAAaTALHLERET 
WRDRIJO^PTDSKLVILCSKHRSDACKWLPVVOKLIKBCVSVLWW 

VPLrFCPHtTSQSElJWRIXLSGAGLCU)EIEPIIDT>raFU.NNOEWEAYVOia^^ 
ACTASrORIWRALKSYIPLYKNS 

CPn_0155 199697 199488

No robust homolog present in Genebank/EMBL as of 11/7/98

I«££FCVPF1XKLKISLIPIEEMRHELFMXTKNSSSICFSN0EKGIRTY1TCSDII^^ 

YFLRSNINPN 

CPn_0156 200147 199770 

No robust homolog present in Genebank/EMBL as of 11/7/98
LQCOKUJUWMMHNIVVt^EEPCRSArLGRTAFFPNKYPIAOCGWrPSTICNI^ 
FYFYRAATPQSDHPDGCCFILLERLKElJGAGFFYCDLRESNTTCFTLrFBCSNXCV^^ 
LFIROE 

CPn_0157 200753 200298

No robust homolog present in Genebank/EMBL as of 11/7/98
FSFVTYK£yU^rY0rSPGASPNWOASL>lAOLNSYrCUX:ETVnTlIISLRPSCULAKKE 
KANAOTAEKIUCILSF ILFPLVLIALAI RYIXYNKFNKDLDRAVFF I PTEITKAEELI I A 
KNPALVKEAALTVSPLFYSLPKJCYOLMKVETP 

CPn_0158 201463 200894

No robust homolog present in Genebank/EMBL as of 11/7/98 
PPKrrLSINIDLLLEDLOrrDSI PWPKLYLSEDFDFAYYPESKAI rmVAKLEKNNPCEEF 
CLESKKI LARYLLEOLFKLETCLNFPTST I DCGRESFLI EFSHETKKPTVWAFI YFYYYH 
SNGPKLEKDFKOAGCEVHNRLLNLGUC^RPOAGAONDGRNGGPYCP ICFLIVWEENYCSV 
UCDHGFIKON 

CPn_0159 201811 201467

No robust homolog present in Genebank/EMBL as of 11/7/98
CCPOGETATRI FSWTPSGFSLATEEIC/OVSTAEKVIKILALIFFPI ILIALAIRYFLHRX 
FDRKCFVIP0DTPKELELILAANPOLVEKAAREVHPGFFALPTKY0SMYI0T5KG 

CPn_0160 203794 202127

pfkA-Fructose-6-P Phosphotransferase

TVELLSLNKSYTEIORLRYRPE I LTLLETIRSKH IOETS5PPSPPPEL0KH I PNLCRIPE 
vniYTEOETSSKPLKrCVLLSCCOAF^HNWtCt.FOALRVPNPKTRLFCFIKGPLGLTR 
CL'rKDLDI^VIYDYYNMOGFDMLSGCP.EK IKTEBOKKNI LrnVKOLKLOCLLI ICCNNSN 
TDTAMLAriTLAHNCICTSV [CVpKTtCODLKNCWr ETSl/^FHTlTCRTYnEM ICNLAKDAL 
.TAKKYHHF t RLMOCCA::YTTt.ECCL&TLPN t AL I CEL t ATRK t iSt-KQLS EOLALCLVRRY 
KJCWWnrVL I PBCLI eh I FtrrPKL t OELNVLUVNCD:;.-: TEK l UlKLnPETUCTFHLFPK 
D I ANOt-U^RD.';irjJVRV::K t ATFXLLAVMVKKE r EK I K PMMEni:;Vf;i IFFHYEARACFP 
rjNFrrNYG r ALC I ISAl-FLVROrPrr/M tT [NNt JVySYTEWOf ;t;ATPI.YKMMHLENRLXrrE 

Tt'V rKTosvrpKr:f»Avom.[<xj:;D;;r:r.vr.nLYi'Frcpt/jYFt ;keel i dorpltliwenot 

H.';rFVACY:rriy:KR::i. 

f.T-fl.Olhl .Mllh'iH /0 1."tH 

ipf'j'liCCtM .icylCt.lfiiUl.t.j::'; t.iiiit ly) 

HRriSTSRKOEi'LL i*n ;wu:kh EOf'TMF: : I ;n J j/Nprrpv;i xj rri'u r/NprYfM V r 
I A:;i^KT»;sKR:;MVRtAoELTnu; [AALfrvuLi J :iK:DCBr;Ki>inK::i J -NYKv/ 

Hr:U Jl ! OOERLA I tt::::;u:tTrLAl/yPM-Ff NK l KAI JWWAIT r.-k ;Kli^AA^W^KNAPeVI
TM;:yKCAITYACHTLNPDPrrOFLKIOtVKELh NLPPrLYMOCEODLLVSimiRTL
rTEAFANCDKP tTILTYPDVDHAFPFAESSALSCLTOWUCRELTSCE 

CPn_0162 205870 204803

No robust homolog present in Genebank/EMBL as of 11/7/98
fVYTLYNIOSPFRIHKLYS r SSDVDTPWtFOLKSKVDSYLflXX:NR r KWSIVW 
rCKVE2JVll I^TIVKILK: LSFLIFPL aXAI-UJfYFUl\XYANHt^VSKILEJC\^ 
:B:y:CTX';HYK LTTLVFVTOKWLOAK:SNPLE:VEykALRTTKPSFFCn>^^ 

;.; ..\;:;"r:;.:..;v.:wiT:r: i:!.rrM::: ' . y.M^KPv: .•rr/„MLP'rrr.-:::r:\'-:KP 
• . : v:.:!i:V : .-^ ^ rrvnr. fvxL: Nirr: v ;p! .r:.;-; -vr?/ :v;-rtirTr '.rirr rwrcvrrR-r: 

PLDEDRCGCFEILEOLOELCVRFP rCPSOGPDNPNFOGFQG IRIYWEDSYQPNKEV 

CPn_0163 205931 206394

No robust homolog present in Genebank/EMBL as of 11/7/98

r EKAI VYCIKCXOI IKCIS I IHTPTPATPLCTBCEIFPGPVDSAIONDLERLLTVKKRPO 

1 1 REVUWGCSLVTTYPKBCORLRSPEOLRVU)OLVOSYPNHLHAIELDCGAI PODLIGA 

TVr ITFADFSTYILSLRSYOANSPSDDrrWCIWFCS IDDPVOAVISFUCDHCFALPSTLAQ 

DPLLCTNK 

CPn_0164 206444 206998

No robust homolog present in Genebank/EMBL as of 11/7/98
LCFKCIYrKIIFSFUCOUflTlSTIESSDSLCSRSFSOKLSVOTUOnrESRLKKlTSLVI 
AFLTLIVGGALIALACGCVLSFPLGl.ILCSVLVLrSS I YLVSCCKFFTLKEKIMTCSVKS 
•KINIWFEKORNKDIEKAI^PDU^ENKRNVGrmSARWt^ILHErnCI lUOlV^^ 
WFYL 

CPn_0165 206983 207592

No robust homolog present in Genebank/EMBL as of 11/7/98
NVtXFMWrfVPKTIDKVDPESEr DIRKWSCYKL IKECOPCniSLISELL^ 
RSKY0EOARTiJ^DEDAPIJ*CLTRSYY0DGYLTPUlACPRDLINHYIHUUUtENP!(H^^ 
KHPCYYAi^IJ^ESVCVYREIJDIEIa.TK^^fVBGDySKEOEKNLQAILS^V^^ 
LtEHKETDLIGRCFTDVTCT 

CPn_0166 207594 207962

No robust homolog present in Genebank/EMBL as of 11/7/98
rCUCOYNKSDSIMSESIhmSIHI^ASTPFFIiaTWLCESRLVKrrSLVISUALVGACV^ 
LVVtrVAC ILPLLPVLIIXI ILITVLVU.FCLVI.EPyLI EKPSKIKELPKVDELSVVETD 
STL 

CPn_0167 208309 207977 

No robust homolog present in Genebank/EMBL as of 11/7/98

NLWSHFPRGFFMLPFCPTIUJaCPFLNSE^mSLERI^TVDSyFDLOSSOIVrLSKOI^ 

ir/EELSAKDRKFKPGSMNCTLVTEDPILPAHNSFSNCSDIOMRTPISPIH 

CPn_0168 208716 208417 

No robust homolog present in Genebank/EMBL as of 11/7/98

SYI^ILRRRENPEHFFNPCHPCYYARIAFNES^mrYWaJ7^TA^JC0My^^ 

LKSILSFVQILDEKDGFDDFLATHKDTTFIGRGGADXFCS 

CPn_0169 209537 208710

No robust homolog present in Genebank/EMBL as of 11/7/98
SFHIETTIGQIhOaNVCSECSOPLVMnjn^PlJWUIESRLVKITSrV'IAIX^ 
TAlJ«»GILSrLPWLVLCIVLWLCAUT*lJSYKrCPrKEIX^^ 
KDLOCATEXPEU^;£NRAE»^^^^SAJlS0VKETUUX:^^ 

TTODVDPN^EDSIRTVISCYKLIKACKPEFRSLISEUJyWJSGI^tXSRCSRVOERACT 
VSKKnAPLFCPTHSYYRDGVLTPLRAGPRVT INRAI 

CPn_0170 211098 210025 

No robust homolog present in Genebank/EMBL as of 11/7/98

^AmKNHIIRGEKyNTCTVIAFVLSMSyI3TIJKNt^EDSVHKIC^ 

EAIIKNLPKADIHVHLPCTITPOLAWIUSVKNGFLKWSYNSVmmRI^^^ 

FRN^ODICHEKDPDLSVL0YNIL^nfDFT^SFDRVMATVOGHRFPPCX3IaNEEDUiIF^^ 

y}0CLDDrrrVVTEVQQNIRLAH\rt.YPSLPEKHARMKFYOILYRAS0TFSKHGITLRri^ 

FNKTFAP0I^m}EPA0EAVQWI-0EVDST^PGLFVGIOSAGS£SAPGACPKRLASGYRNAY 

DSCFtCEAHACEC lETRTIFSSAfCWPBCLIE ITRVTFSSLKRKOPSSLPIRVTCOLC 

CPn_0171 212444 211149

*guaA-GMP Synthase

I IKLOSARRHLOTIFIIXrGSOVTYVl^OVRKLFVYCEVLPWNISVOCLKERAPLG 1 1 L 
SGGPHSVYENKAPHLDPE I YKIjGI P ILAICYGMOLMARDFOGTVSPGVCEFGYTP rHLYP 
CELFKH r/DCESLCTE r RMSHRDHVTTIPECFNVIASTSOCS ISGI ENTKORLYGLOFH P 
EVSDSTPTGNKILXTFVOE ICSAPTLWNPLYlOODLVSKIOrrrVI EVFDEVAOSLDVCWL 
AOGTIYSDVIESSRSGHASEVIKSHHfAflGGLPKNUCUCLVEPIJlYLFKDEVRILGEALCL 
SS-^LLDPiiPFPGPGLT I RVICEI LPEYLAILRKADLI F I EELRKAKLYDKISOAFALFLP 
r KSVSVKGDCRS'/OYT I ALRAVESTDFKTCRWAYLPCDVLSSCSSR I INEZ PEVSRVVYO 
rSDKPPATIEWE 

CPn_0172 213237 212440

*impD-Inosine 5'-monophosphate dehydrogenase (COOH-terminal
region only)

APIGAAIGZCPLCI3RAHHLVEACAhWLVrin-AHAHSKCVF0TVLEIKS0FP0ISL\A;CN 
LVTAEAAVStAE IC*/DAVKVG I GPCS ICTTR IVSGVCTTPO ITAITNVAKAUCNSAVTVI A 
a:RIaYSCDVVKAIJKAGAIX:VMLCSLLAGTOEAPGDIVSIDEKLFKRYRC^CSUyW^KCXI 

.':adryfotocokklvpgcveglvaykgsvhdvlyo r lcg r rsgmgyvcaetlkouctkas 

r/RITESCRAESHIHNIYKVOPTLNY 
CPn_0173 214041 213715

No robust homolog present in Genebank/EMBL as of 11/7/98

T I FDLI YKIDSYKH00GFMDFSVFPDRFVESTSP.':P : EOIDAKTLVSNCCKYCSRCLFI F 

UXUII t r':K::WTrSGETASLVFCILGLIVLVLL£ lECRNRECCRRIS 

'.Hi,OrM 21421*5 2M724 

No robust homolog present in Genebank/EMBL as of 11/7/98

K t F r wr/KK IV r i.:;m imtp t niis r j palnpel-=;li p prrLv-r/rrcr jla-h" i paocrrs 
TLftHLOiFi I rt/;LATi i:n'FrvtFFLwnLNLL:rrr:;i t=:::;(:L[ ivciXFLrMCLYFM 
: JUViLWLujKELOJAbiEn EEEY tOE c ealrcafiue:; rTt:::paTWL 

'•r'n.mv, 2MH',r, .11^275 

No robust homolog present in Genebank/EMBL as of 11/7/98

LI J JtfTFOFri JikPC^EC iwn/ lODTTTVLYALNSFDr'RLJLD-n IREjGKOSPLEAENALGE 

Ki rr:r,r/i7t::Fn.EKVAf riLpf:YiiPKFYLSFiDROC/jnvHYevLix;vFLKTVAACi tens 



FLTDGMSPEIXGEVKEALK. 

CPn 017»i ♦«.,i.l,627fc „iJ*.5l»J 

CT153 hypothetical protein
NDDDPHDESDCEEASKDSAFSASFSYErVKGSTREIJKNnTI rrTTASRTLY I LROOCSYDP 
RAUC^TODEFRYWVEKRIiiAK^^P0SLNA^VK^/CTHYVA5VTY0G ICFOVLKMSYLOVEEL 
EKEK tS ISVAAASSU.KSKTGNATEKCYSSY0SESSA(>rVFLCCTVLPDL0ODKLOFXDW 
SE3IP^rcPI PlAir^-^SITCLI I PELFFSEDAOVt^KKSALGCVILNYLESHKPKEECP 
./-p,.- - « • AfA'**"".*''^^/* T rv.-i — ' f^!^' .KTT'*'^A? PL^ffU* ? 

: • • '-v.-f ir .vi tr rr. " • : : a*;t'" t • , ■• ■ :: . pyha: . * ■ ■■- : ■ wi vav- iykdrcTaT 
UOCLNTTGOU-' I fl JTJOE I ft ^1 i ;K V UV:T JMJ L> ^ K L JTTJTJU JVF : m 

CPn_0177 217513 216608

No robust homolog present in Genebank/EMBL as of 11/7/98

DKREQTKSKFIFtlSEESMKOPMSLIFSSVCUILCLCSWSCNOKPSWNYWrSTSCE^ 

VHCrnCSVSOLPHYPSAFFrrOIFSEEHNDPYWAKTOEESRKIWREIHKNLKIKCSYIPI 

STYCSLMHPKSAALTUCTYRPHPrWINGYERSFNirnCKYLKNGSRRRTSHKP^^ 

NLIKSSCRRCmiCLEHreEDFVIARRRBCVYSLYPVEVCSYPOCNPFVIAYAWXADESA 

CSKEVLPVKCYYSLVWESVSSSOSUiAFCDSFAEDYUlSTFLANCTSIUn^ 

OP 

CPn_0178 218052 217789

No robust homolog present in Genebank/EMBL as of 11/7/98
VKEyU3FLVCR^WERDPa^CRHCTVSCKFCCESIDAlCrr^CQLFHIAG^ 
ES ILKOLLALCI ITCYENREREVWVYLD 

CPn_0179 218550 219056 

No robust homolog present in Genebank/EMBL as of 11/7/98
PKIWimfrETRIEATSVPKFhmRLRKSFHKSCRSSRPSKACVANrFNrrLOACRSCIIPG 
iCKAILUJV^mA^^PNYSCIFESIC^FlJE0OLEAQKN00AALVWCIUC^ 
LPRSLKKDRKFMSSLIFTKLSYALDLSAPMHLECKPNLSYEEKLO 

CPn_0180 218963 218355

No robust homolog present in Genebank/EMBL as of 11/7/98 

TSLHKrUXnCYKPV^IO^m^ASETYPS0rLKAOR£VRnAYFNOAa:HPAiUWILE^ 

CLLDVYHTNHYSVrrrCVONYPNLRFTFVSSKNNIKNGL^ 

tJU^CKIRNIEVPKVVGLI)UlSCILISKIXLKOPO^OSLTEDFV^mSTNOEEARV^ 

LI5LILLCXQAVL£SrOEK)CRS5 

CPn_0181 219175 218777

No robust homolog present in Genebank/EMBL as of 11/7/98
VHELrKIDCVYYFrKKF>KlJYhWYSLWSHKEKPSSLEKAVOALDSYryW^ 
DDISREIYC\mJU.YIRFVrtVSISOSLSRIPWRLKRILLRYCTLRGKYVMPILIKRIAILL 
GLZRFSRLRK5VY 

CPn_0182 220704 219334

accC-Biotin Carboxylase 

RCMOCVLIANRCEIAVRI IRACHOLCLSTVAVYSLADOEALHVLLADEAICIGEPOXAK 

SYIJCISNIIJW^CEITCADAVHPCYGrLSENANFASICESCGLTFIGPSSESIAmSOKIA 

AKSUVKKXKCPVIPGSEGIIEDESECLKIAEKICFPmKAVAGCXXatGZRIVXSia}^ 

RAFSAARAEAEACFTWPNVYIEKTrENPRHLEIOVICOrrHGNyVHLCEROT 

IEErPSPILNAEIRVKW3JCVAVDLARSACYFSVGTVmjJ3KDKKrmt^^ 

rrEEVTCIDLVKEOIHVAMGNKLPWKOKNIErSCHI lOCRINAEDPTONTSPSPCaiLOrf 

LPPACPSZRVDGACYSCrrAIPnnrDSMIAKVIAKGlQlREEAZAIMKRALKEntZaCM^ 

IPFKOFKLONPKrLESNYDZNYZDNLLAQGNSFFXEF 

CPn_0183 221207 220695

accB-Biotin Carboxyl Carrier Protein 

RHLC34DLX0ICKLMIAMCRNCmtFAIKR£Xn.ELEL£RI7rR£GNIU}ePVrr05RLF^ 

QERPIPTDPKKDTI Ki^l ' l ' iLN SgrSTTTSSCDriSSPLVGTrYGSPAPDSPSFVKPGDlV 

SEOTIVCrVIJtfOCVWJEVKAGMSGRVLEVLITNGDPVQFCSKLFRIAKnAS 

CPn_0184 221814 221221

efp-Elongation Factor P 

QWKIKFCCCEEXIMVLSSOLSVGMFrSTKDGLYKVTSVSKVACPKGESFIKVALOAADSD 
WIERKnCATOEVKEAOFCTRTLEYLYLEDESYLFLOLGNYEKLFlPOEIMKIWFLFLKX 
GVTVSAMVYWAr^ffS^^PHFLEL^^VSKTDFPCDSLSLSCGVKKAL^.ETXSIEVMVP 
ICDVIKIErrRTCEYIORV- _ 

CPn_0185 222457 221765

rpe/araD-Ribulose-P Epimerase

AEVKKOESVL^A:PS rMGADLTCLCVTJUOCLEOACSDFIH IOI^!IXmFVPNLTFCPGI lAA 
INRSTOLFLEVHAMI YNPFEFIESFVRSCADR I IVHFEASEDI KELLSYIKKCGVOACLA 
FSPDTS I EFLPSFLPFCDVV\rt^tSVy PCFTCOSFLPNTIEKIAFARHA tKTLCLKDSCLI 
EVIXXSIDQOSAPLCR0AGADILVTASYLFEADSLAMEDKILLLP.CENVGVK 

CPn_0186 222878 224068 

*similarity to Cps IncA

P I KOK I LMSSPVNWTPSAPNI PI PAPTTPG r PTTKPRSSFI EKVI t VAKYILFAI AATSC 
ALCTILCLSCALTPGICIALLVIFrVSMVLLGLtLKDSISOGEERRLREEVSRFTSENOR 
UTVITTTLETEVKOLKAAKOOLTLEI EAFRNENGNUCTTAEDLEEOVSKLSEOLEALERl 
NOLrOANAGDAOEISSELKKLISCWDSKWEOIfnrSIOAUaOLCOD^eAO^TYfVXAMO 
EO lOALOAE I LCMHNOSTALOKSVENLLVODOALTRWGELLESENKLSOACSALROEIE 
KLAOHETSLOORIDAMLAOEONLAEOVTALEKMKOEAOKAESEFtACVRDRTFGRRETPP 
PTTPWECDESOEEDBCGTPPVSQPSSPVDRATCDCO 

CPn_0187 224218 225045

predicted methylase 

VFUTYTRTLPMHSK FLGBRKKNSrSHKEETrW: r ASriYtiK { VQOKCHYYItRETI LPOLLP 
JLTLCGKSSVLDICCCOnFLERALPKECRYLC ID [i^-TPL lALAKKMRnVNSUOFKVADLS 
KRLEFVEPTLFSHAVAILSLONMEFPCFJV I RNTATLLEPCCOFF IVIMiPlTFR I PRASSW 
H YDQiKKA t SRH I ORYLSPMK I P IMAItrr •,OKD.';P:rrL.':FMFPU';WFKELL-nifr.FLVSCL 

EFwr;;nKT.'7iT;KRAKAENu:RKEFPt.Ft I :;c I K I K 
CT152 hypothetical protein

KTUKnMFRKLFPF::KKKTCOKORU<NNt:i,I/jAI tOr:iKVt.UIHKA::KFJU:Vtj:YYGLL 

'iivi-r LVFFLRLiVtiLFTNLNWKEwi.i I KKitz/KKPiVA ivF^YHATh::;N u :lv[,w;:;f 

fVfi'.VfAi : I LMr.U;LEDr;LNK I FPT:,WP(* I : :t riM .V: :Yr/ ttlv.':! t* l K 1 1 Vt .f ^* l Y CTO 
(MP IOYAKLF:;LJn:;HTALVF I :;RVV( Yt.LI.YI^a^Y;f ;YA/-I.f'l'yA tUKT::AI.1 :7TU IC 

: :vw t vF(;kaff;;u*v.': I FUYcrrft yvLVAu: :Fr .li .uy i ttm i Yi.rt t yvr.TF r lONRf rr
;;KNKENLTLJE : ARR I K

CPn_0189 226391 227825

'Till homolog-(Possible Transmembrane Protein)

ITSKOtHAWSYAKIPUSITKWKHIEITSOAOt^AWKDPN^^ 

KFSOIRYSSSTVl^PSHUWLISIDNKKHLWRLO^^ 

PLDVAr^SLNIEITIYKNAHLEADAIUlNPUJaSCSMS^^ 

ILSPHFSYAEARFSGKAOITITrNUTPKFSQICrTW^ 

LrHC0FCSLPI^LVS^mLAPFHUCJaTFSFKTDOTCFV5prt^ALI^ 

r PDLSsUDESSTSPSSKDUCIQCSCEIFSLPUJSrn^^ 

WYNPKMNKLTLLSJmCSEALLSELKLVMDFSMKL^ 

FI r^lDFKCSLRAN^tt^AKIEVCLKCSCtAPRODSKTLAEFSLmOTO^^^ 
LiwrHrPSSFrAS:iPMSPGLKA0ISStJ\GPRrNVSrKNAFRFGEGPVOI^ 

tRWSFECTRI0SATLl»CKISIA^m^^MyALFOFLDITD^^ 

UCKRIJJALIDRRIRIJ^ICrDIAHDRLFWrWIOPEVIKKyFHNK^ 

CSlSpE^SAYARIAIiKSYSt^PFSSt^KLFSSI^DSTPP^ 

SIENK 

CPn_0190 229901 231274

No robust homolog present in Genebank/EMBL as of 11/7/98
LIX;im<RXRHSFDSTSTKKEAVSKAIOKIIKIMETroPSt^ 
KOKLSKQAEOLCU^CSOETLSNLEMTfUSUCI^IGSVIE 

SlSKRICKTHKKVATHDVNISSWnmHSVACWRFRGKID^^ 

SKDYIXRYVSSOLTIDKVEDKPrTKP^KGKLLYSOCr^SPKLESPLPLCU.TS^ 

SASKSNrx:SFPFSAIilHm"ESimx:FQITSTn.SGNOACTYWSLS^^ 

PEV0I^LVYSYEDWLPIDNIFNMS0PRTIPLAIJJSOTWJ«WKYDIL^ 

SPNCSRFSL0LKQTM3FENSPVDFYIVHAAHSCHWSGF 

CPn_0191 232039 231314

glnQ-ABC Amino Acid Transporter ATPase

QHHFPVFLGYOKREXT/KriRVRNLAYSVNKKKILDGVTFSLERCHITLFV^ 

LRALACLVOPTOGDIWIEOEAPALVFOQPELFSHIfrVIX^CTHPOIHIKGRSTEEARDO^ 

FELLHLLDI EEVAKNYPDQLSOGOKQRVAIVRSUaiDKHTLLFDEPTSALDPFATASniH 

U.ETUWELTVGLTn01QFVHSCU3RIYLIIXX7rVAGVYDK^ 

AQ 



E : r p lYYLrrroYLTODF : e/. 



CPn_0192 232643 231984

glnP-ABC Amino Acid Transporter Permease

E^raVDHWlAIARIiUlOCGYTLD/SGIGIU:GSru:iXICTVTSLYFPSKLTK^^ 
TVIRCTPLriQILIIYFCLPEVLPIEPrPLVAGIIALSMNSAAYLAQilRGGINSLSIGO 
WESW*VLGYiaCYOirVYIIYPQVFKNILPSLTNEIVSLrKESSILMVVC^ 
VSRELMPMEMYI.ICACLYFL>frTSFSCISRLSEKRRSYDN 

CPn_0193 233144 232686

*argR-Arginine Repressor

KUiGVFHKKKVriDEALKEILRLEGAATOEELCJUCLLAOG^^ 

ACERCARYSLPSSTEKTTnWLVt^IRKrUSLIVIRTVPGSASWrAALLIXXJLKO^ 

LAGDDTirVTPIDECRLPLLKVSIANLLQVTLD 

CPn_0194 233162 234241 

gcp-O-Sialoglycoprotein Endopeptidase

EVPWriKGNVFFSWFFMLTLGLESSCDETACAlVNEDKOIUWI I ASODIHASYOGVA/PE 
LASRAHUlIFPOVINKALQOANU.IEI»©LIAVTQTPGr.IGSLSVGVHFGKGIAIGAK)CS 
LIGVNHVEAHLYAAVMAAQNVOFPALCLWSGAKTAAFFIENPTSYKLIGKTRDDAIGET 
FOICVGRFIJGLPYPAGPLIEKLALEGSEDSYPFSPAKVPNYDFSFSGUCTAVI.YAIKCNMS 
SPRSPAPEISLEKORDIAASFOKAACTTIAQKLPTI IKEFSCRS ILIOGCVAINEYFRSA 
I(yrACm.PVYFPPAKLCSDIJAAMIAGLCGENFOKNSS I PEIR ICARYQWESVSPFStASP 

CPn_0195 234172 235785

oppA-Oligopeptide Binding Protein

YSGNSYMRKISVG :C IT I LLSLSV'yLOGCKESSHSSTSRGELAINIRDEPRSUJPROVRL 
LSEISLVKHIYEGLVOENNLSCNIEPALAEDYSLSSDGLTYTFKLKSAFWSNGDPLTAED 
F I ESWK0VAT0EV3G r YAFALNPI KNVRKIOECHLS IDHFGVHSPNESTLWTLESPTSH 
FLKLLALPVFFPVHKSORTLQSKSLPI ASGAFYPKN IKOKQWI KLSKNPHYYNQSOVCTK 
T ITIHFI POA^^'AAKLFN0GKL^JW0GPPWGER I POETLSNLOSKGHLHSFDVACTSV^ 
NtNKFPUWMKLREALASALDKEALVSTIFLCRAKTADHLLPTKIHSYPEHOKQEMAORQ 
AYAKKLFKEALEELO ITAKDLEHLNL: FPVSSSASSLLVOLI REOWKESLCFAI PI VGKE 
FALUJADLSSCNFSLATGCWFADFADPMAFLT I FAYPSGVPPYAINHKDFLEILON I EOE 
ODHOKRSELVSQASLYLETFHIIEPIYHDAFQFAMNKKLSNLGVSPTCWDFRYAKEN 

CPn_0196 235906 237519

oppA-Oligopeptide Binding Protein

KLKSYSKERSFMLRFFAVF ISTL^^ITSCCSPSOSSKG I FVVNMKEMPRSLOPGKTRt-I A 
OOTLMRHLYECLVEEHSONCEI KPALAESYTISEDGTRYTFKIKNI LWSNGOPLTAQDFV 
SSWKEIUCEDASSVYLYAFLPIKNARAIFDDTESPENLCVRALDKRHLEZOLETPCAHFL 
HFLTLPI FFPVHETLRNYSTSFEEMP ITCCAFRPVSLEKGLRLHLEKNPMYHNKSRVXLH 
K I IVOFl GNAhfTAA C LFKHKKU>JOGPPWGEP t PPE ISASt^ODDOLFSLPGASTTWLLF 
NtOKKPWNNAKLRKALSLAIDKDMLTKWYCXJLAEPTDHILHPRLYPCTYPERKRONERI 
LE.^OOLFEEALDEL0^T^REDLEKETLT^STFSFS•/^;RIC0HLRE0WKKVLKFTIPtVC0E 

FFT tOKrJF^EiJrr^::LTVNi>rrAAFrDrMjYui IFA^ipa:I jpymu>d::hfotllikitoe 

HKKHLPtOL 1 1 F^UTiLa ICH I LEPU'lirNLR lALtlKNT KNFNLFVRRTnDFRFIEKL 
• •Pt>A *t\ in'^'t't** ill** Miiuiutti I't.irctri 

KNVUhK.ItJnJJPOliKKlKVtrKMFltmvnU'LLFIlX'nxr.^JYniTKHKOiTt.tEPrHOOrV 

af:;i lyjAKit/NMUi i Aoi XFCx ;LTkErrt irfc::>NnLF-u\ r A:;r<YTV::EDFC:;'/TFF i kdsal 

W::DfTrMT:;KO[RMAWivYAgENi!PinOIWLNF:n'r::SNAtTtllLD::rNPOFPKL^ 

Ai-%\i KKPENrKt.F:>'t'rrLVEYFpf;iiN I^(LKKNP^^r^DYtlcv^; INS iKLLi r pdiytaih 
I J j^i<r ;KvrA^ w iC^ u r vfEi J iKO: VYM YYTYPVtriAf^c x:L^r^KS^H LTJDUX^RH 
If * r bKf : : 1 1 KhWi rr^joPACTL; :ni'.AiVPNOYKKOK PLTPy KK LVLTY p::o I LRCOR t A 
1 : 1 1 y w.^AiV : t r;i . 1 1 .i i :i .i:ym l.ivnkrkvqdya i atijtt :vay Yr<;ANi . i ;;p:f.dklu;nf 



CPn_0l9H .-K.i.ai'i'i- .-W-*^ .... . ' . 

AYL^RDASUUCftLYECLTPrnXWlAlJUJ^EGVT'^ 

nFSsiKOLYFEEF3P3[HTLLJV'IKNSSArHNA0K3LETLG:CAKCCLTLVITLE0PFP 
YFL-^r ARPVFS PVHHTLP.SJYXKCTPPSTY ISrCPFVLKKHEHONY;.: LE^ 

K^/l'KAKiV* FvE/\KL" ^ii-- i^'viiLJ » ^ -1.- J^w-i 1 ZA^'Ll 1 Ull-TULr: IKIOiJ^^i- 
YHCFLKKRROGDFFIATGCWSAEYVSPVAFLS ILCNPRCLTOWRNSDYEKTLEKLYLPKA 
YKENLKRAEMIIEEETPIIFLYHCKYIYAIHPKIOOTFGSLLGHTDUWIOILS 

CPn_0199 241018 241983

oppB-Oligopeptide Permease

KCLICLSLVFSYIKNRILFNIXSLWIVLTLTFLVMKTIPCDPFNDEGCNVXSEEVW 
SRYCLDKPLYQOYTOYUiS lAKLDFGNSLVYKDRKVTNI ISTAFPISAIUajCSLTLSIC 
aCIAIX:TIAALICKKJCORRYILGASILOISIPAFIFATIXOYVFA\nciPUi>IACW 
TILPTIJOAVrPMAFI I0L7YSSVSAALNKDYVUJVYAKGL5 PUCWIKHILPy^ 
SYSAFLTWITCrrFAIEJJ IFC IPGLGKWFICS IKQRDYPVALCLSVnfGTlJMLSSI^ 
DLIOSIIDPQIRYAHGKEKKRK 

CPn_0200 241996 242868

oppC-Oligopeptide Permease

EKwbaLMENLSSAPSRSrrfKSIIONKMLVUILTTLIILMI^ ' 
rL\rePCSRFPFGT0rLGRa!FARTLRCIJtI^LLIATIATLID\,'CVX;UWAT^rAISaa^ 
DFLKKRTTEIIJ'SLPRIPI I lUiVIFHHGIXPLIUW rrCWI PISRI lYCOFUJJ^ 
PFVLSAKAMHASTFHIUaCHLLPNTtAPIISTLirriPMAIYTEAriSFLGLCtQPPOAS 
IXrrLVKBCINAIDYYPWLFFFPSLIKIALSISFNLIGECAia'LCLEECSHC 

CPn_0201 242810 243715 

oppD-Oligopeptide Transport ATPase 

ASrSSARGLKHYVSKRDLMIKfU2iIKDLTrrSTNPKRTLIENLSL0LKENR^ 

CSClCrriTWaLGFLPDCLim:SI[J'EDIDITKLSPKELHKIRGOKIATIL0KMCSL 

TPS!«ICM0IIETUU3HH»*KKEEAYNXAM0LLTOVCIPNPKySFSOVPFa.^^ 

VIAIALASaPKLILADEPTTAIXfiMSOAOVrUlILRNIQOOKQATILLVTHNr^V^^ 

DICIIKOIKI^ETCTVEEIFl^PKHPYTLKIX^ftVSKIPIKKTSSPILKN^ 

GL 

CPn_0202 243682 244500 

oppF-Oligopeptide Transport ATPase 

VPTSNEYARWFKrTLLSIKDLSLTIRGKKILrmiNLNLIKCSYLTIVCPSCSCICSSLALT 
ItjLUC ^ n^T I TFWPKIPRARKVQVrwODIDSSI/fPCMSIKCIISEPLNIIGTYSKA 
BONKEIVNVUSLVNU'KSVUttJCPYKI.SOGOKORIAIAKALVSK 
NQSLILDLFQTIKKEYQNTLLFITHOMSAAYYIACT I AVMDOCSLVEHACREKirSTPKH 
TTTODLLDAIPIFSLISTEUEPSEEYELOVASK 

CPn_0203 244966 245802 

No robust homolog present in Genebank/EMBL as of 11/7/98 

IVPU^Kl«nCETSC«NTVTFSPTIOKSFSLrU£KU>SYFrF0GTRT0IL^^ 

AWCRCCKWIEKIIKII^FlU^LVIIAFIUlYFUiKKnnCQririPIOaa 

SRPOAVEaCAVREISPArrSIPRKYQLIRIDTPKDDAPSILFPIGIEIILKajCIDTMOS 

NLFLKROffiFLGHPEEKAUTSICSIEKDOEWMSl^SKKLLITHFL^ 

FNPENCRCYrSEISTAKIHFHOHCRYCPIRSSGPIMKEI 

CPn_0204 245691 246002 

No robust homolog present in Genebank/EMBL as of 11/7/98

PR£MAWmlWKySKDPFS5AR5IWANPF^GTHHEGNIKIKG^CY0IFTRLKKIiSZSF5S 

YNSINPMPYFFOECCFVYWESOrKSALOOHGILOKQrrETFYROT 

CPn_0205 246073 246327

No robust homolog present in Genebank/EMBL as of 11/7/98 

lEDSIKCYCSASAFRNPPOLLLKFFLVCEELCILTVATHRALLETPLALSFFKELICtlCYV 

YRAKOILOLHNYKCFTILNTSPLCS 

CPn_0206 246346 247161

CT203 hypothetical protein 

I VDRRSPACYDS INSDAICVSLLMDISHILEDLAYOECILPREAI EAAIVKOMOrrmi. 
H I UiDATQRVPEIVNTOSYOCHLYAMYLLAOFRESRALPLI IKLFAFEDCrrPHAIAGEVL 
TEDLPRILASVCNDOSLIKELIETPK INPYVKAAAISCLVTLVGAGKI PRDKVIRYFAEL 
LNYRLEKOPSFAWDNLIAGICTLYPGELFYPISKAFDGGLVtrrSFISMEDVENIIKEETV 
ESCIHTLCSSTELINOTLEQIEKWLEOFPIEP 

CPn_0207 247208 248617

ybhI/sodiT1-Oxoglutarate/Malate Translocator
VNKKKRFLSLLFLTAVLLG IWFSPHPAS I NSNAWQLFA I FTTTIMCI I FOPVPMGAIAI I 
CISTLLLT0TLTLE0CL5GFHNPIAWLVFLSFS lAKCI IKTGLGERIAYFFVSALCICSPL 
CL3:/CLVIT0FFLAPA I r5\TAPA0G I LYPVAH'SLSDSFGSSAEKCTODLICSFLIKVAY 
OSS'/ITSAMFLTAMACNrLVAALAGHVnVSLSWVLMAKAAI t PCLPSLFLHPI ILYKLYP 
PKtTCCEEAIRSAKLRLKE>CPLKKEEICTtLMIFFLLV/LWrFCOLLCISATTAALICLS 
LLI LTNI LDWOKDVTANTTAWETFIWFnALIKMASFUIQLCF I PLVCOSAAALVSGLSWK 
IGFPLLFLI YFYSHYLFA^NTAH rCAK/P IFLAVS ISLCTNPI FAALTLAFASNLFOCLT 
HYG3GPAPLYFCSHLVT\*CEtM>SCFAL3 rVN rVIWIG IGSLWWKALCLI 

CPn_0208 248935 250602

pfkA-Fructose-6-P Phosphotransferase

SVAVILMItPLYVDUTT t I:?SY3PPLPKEF0EAA.SLt AVPOTSHSKPWPCVKTLFPQTYM 
LPyUCFVOGENWirPPLkT..TW!4FStX;PAP(KKNVtOGLFNSLKDFHPOSSLVCFVNM3rH 
LTtfftKf; tDITEEFLJKFRN;:;nFNC ITTGRKK TVTPF^JCEACLKTAEALDLDCLVI lOCD 
I ;,';(/TATA I LAEYFAKRRrKT:; 'AWPrT I d ^DLQl tTFLDLTFCFDTATKFYSS t ISN ISR 
nAL:XKAMYIIFtKI>i:KL:A;:MtALB;/a//l'in'nCAL£'jEEIAnKNLPtJ(TnKKICSVXA 

upyiAMEKYYt w I L I PFi : I T r.K I TE 1 1 iH . [TK t F-iL; :e*/edk i:;nL:;rEr*oRLLKSPPAP I 

I CO r r DRfJAI WNVNV:*K : ; ^/f/KLL t Ml .V::NI It/X^YFr-MVPFNA [::HFLGYECRSCLPTK 

mATTT vn\uy{( :«v : 1 1 A'kn-. t :: v ;Yr j rr r k: :i A( :r'KHr/rKi.RA i r wkmftvkooaogtlq 

l-K IKKYLVDI(::rrAKHK!-*KI.V!'K lWAI.mj;;Yr'K(/;i f/^I CTI'l'KMH:;tJNFPrLTLLLNHN 

^w^p|^«^•.*:rEI^fT^'Y 

»;if !_').!»•» JM .:M 

No robust homolog present in Genebank/EMBL as of 11/7/98

H::;;nmni4<MLTY::r:rr!*.:^»niT::r.Y i HKKi.i/.-rYKijFv ;Km<vi a itiw a-\ t aybqn c
No robust homolog present in Genebank/EMBL as of 11/7/98
YQKLWERER FKT r REXEHAT r TTMLVELEALKREFAHLKDOKPTSDC EITSLYCCLDK 
LEFVUX:1^;DK FLKATEDED'/LFESOKA I OAWIIALLTKARSVLCU^C ICA I YOT 
AYL3K'/^mRAFC:AJErHF:,KTAIRDLNA•ry'LLCFRWPt^KI^EP/C>^;NIXr^rt:IAKi^ 
rrFEKETKEUJE.':LLnEEHAMEKCS [ODLORKLrTCI I lELHDVSLF'TrSKTPSOEEYOKD 

CPn_0211 252765 252463

No robust homolog present in Genebank/EMBL as of 11/7/98
ECVMSYPD rSWOASS rOSALLHKTSDOIQOKRCFKOSTFVI LAVSLVr IGSLFLLAGVA 
: LTVFSKCVl^LVFC^^ IVLCLIXLACGVCIXVEEAKSLL 

CPn_0212 254081 252888

No robust homolog present in Genebank/EMBL as of 11/7/98

ELSYCVWSIYSEILSFSELTSCKHSLFPFCPIETASIRIHHVFNWr/CLIILGTLFVC 

C>CMVFU:VFSTYLLCHSSMILCUiISrCLAIXXFXERYCLEPKELr5\^^ 

: rOMODOIAOLAREXJSLEOKKffaiRCFSARIXVLECSKTEICKOIUCICVPRNl^IO 

AOEONSILEOCKEAIXFRRKSAOEIFKKLYDRKAAFWRSYTlEDUOrSEIKVSKKALSNL 

Y tGDVTECTAPHFU<EAYAtCRTAKhnJlNYVKVCVEDMRVNEE^^ 

E I ETOLENEniLFTSDSEDVLEEYO IHCIRVTMLHALWAZYNDEVVSRKP IDTLDRVRAR 

KAVEDC lErrSELOMCWHTICLELEIAOLYVD: U£A 

CPn_0213 254345 254190 

No robust homolog present in Genebank/EMBL as of 11/7/98
: LVVFSRVI FSNTWOIGIPRLELILPLWKJCEIIDPrCrLFSRVEGTri ILNIK 

CPn_0214 255768 254446 

No robust homolog present in Genebank/EMBL as of 11/7/98

FLGLKE0YERPTYCNI PPAPHPORVDSKCCIASKVSTVVWALFILGIFFLSGSLATLVH 

TSCCVIXGAAI^ILC:GLVLL^VALIVFLCHKHKTRODLDfy^ 

SEIAVTFDCLQNLF0FHTTCDFSDLSOEL0GKFINCMEK&rt.TLEDEV^ 

RNFTTFCEOVKGIOSNirDUiEEKSSLyLn-YRIJlJCDWVUJff^ 

AIKGLriRLTSRa)KU3VKA0ERK)CriNEMSREFyJEVEICAn)rVDRAT^^ 

ARLF>CRTESLLEhIKKNEEAlJ<NOGLDPEItt^HPELFSPYOQLL:Unri^SErVlJtt^ 

LISGTVTSCLTtZECENRMRAASTGLNAIXVRKMFRGAIKSAYFEKLTEIEKELRSLQD 

VIKSLEI^IHKIFCDIVTErr 

CPn_0215 257039 255759

No robust homolog present in Genebank/EMBL as of 11/7/98

LTSSKKOVMSSAIARIX:FPSPSPOPSSTr/:VHPPKVTCSLIL^SLrVLCVIXLC^^ 

VNAIFSFSVLTVGr^AGmCSUJ.ILCLIFFVSyHR2CLSEATRSL£OKITLEyOPWAD 

LRKEUiEVOEWSNrLIX)EWn)rKEVVA0HKSQFATrECDmJX;REV^^ 

DVAIXTErjCNIWGPLEFLRJatCDRLOCEIDlOJUCEVHKVCKSCIJ^^ 

KIEOECYRI3ia«CVEKL£\^PEGY7lRELLEVlJCTR^ 

TVTSEEEL0EALDKAICA£LLDI0VRKSVVEDLSCEPTLI0YHUJU.YEVOCRI^ 

TFSSE0EKW*EEYEAUCARIRKTUlVKU30VRAWAFVASTrDliS£SESUXa© 

AHDOFLD 

CPn_0216 257623 257174 

No robust homolog present in Genebank/EMBL as of 11/7/98
NKARTKNPVTFDRIOTOFrPEDrrSLRINSYIVAGCIXILCVVI^rL^ICLDrGLVGLSA 
GAAPTLGLGCLIFAIJLFSrSLILU^OEKRVPDVLSLYI-EKEVPOYETPLYKEDLESER 
DMSAISERIjG I lEEKLRIAEKFRYSDSVFV 

CPn_0217 257881 258579

yp<ip 

PKCCKLKC^LSVNELrFCF(7^FSVVVU^/FFASRCKAWLTt7^LLS5I^C^ 
WSF^VrSADVYVIGLLTCLWYAREHYEKNDINDAMLCSWVIS lAFLVLTQLHLFLIPSPN 
DSSQEHFIALFSSTPR IWASLVTLIFVOIVD r KZJTFLORVFSKKYFAMRSTISLLFSQ 
LIDTI IFSFU3LYGLVSNLCDVMIFAMLVKGIV1TLAIPTLTVTKAVLDRRSS 

CPn_0218 259064 258582

No robust homolog present in Genebank/EMBL as of 11/7/98
I FLSKKVrFESYEDFANVASSWPKSLRAt.VOGRYFVDSELKETPYR IHDFKKTPIHHRLY 
RSLPI I STICG I IRLIEAHSCP IHPRDKMKYRFEVLOAVIEILCLCVuILVTDI ICCFLA 
FLVAIIL3LLLYCNSTFTCV0NLSFTERKLECIGEAVNFLA 

CPn_0219 259348 260472

tgt-Queuine tRNA Ribosyl Transferase

GSSIALKFHmHOSICKS0ARVC0tErSHCVrDTF.V-/PVATHCALKGVrDHSDIPLLFCN 
TYHLLLH PG PEAVAKLCCLHOFMCROAPI ITDSGGFO r FSUVYCSVAEE I KSCGKKKCMS 
3LVK rTDECAWFKSYROCRKLFt^PELSVOAOKDLCADI I IPtXiELLPFHTDOEYFLTSC 
5RTYVWEKR3LEYKRKDPRH0SMYCVtHGGLDPE0RP.rGVRFVEDEPFrx;SAIGGSLGRN 
LOEMSEWK ITTSFLSKERPVHLLCIGOLPS I YAMVGFC IDSFDSSYFTKAARHCLIt^K 
AGP IK IGQOKYSODSST IDPSCSCLTCLSG ISRAVLPiiLFKVREPNAA : WAS I HNLHHMO 

Ovmkeireaiucde: 

CPn_0220 260660 261236 

No robust homolog present in Genebank/EMBL as of 11/7/98
rrSFUCKKCI FYMSKESIRSYSEISTPTP IFRCTPSKECVAYKLOLRSPAKDCILRNRVS 
LKGALLR:; t PFYCSFLCAKRIHSAWSAKDAPCTTar/^HYLVCGLELLCLCNAA/LACKVLA 
TALKFLFSKASSK XKOMKWR EKARNLAAKCnVC^ : KEFCSVDLTSCFTRCFRLRNRVA/EE 
OA3EN0TVREEIV 

rr'n_U22l 2til62l 2620»;.| 

No robust homolog present in Genebank/EMBL as of 11/7/98

OAI JXYK YF.rC tQMVNRYKS.';AEFSADHYYODHLVP^r/KRNLRCt-\rVENEVCLFEEriNt. 

IX-^-VMATWriK-lSItjCLriHUtSVWrrrODPKOSKr^irCFnTALCILCTULCnVLLrKIT 

r*r t u.tLFTn:i.u:YFMYr:A.\Y3DFMp r 

•wiMk ir.v to Dacr^r iopn.Kif TKM (I'lt-li 

Of:K F^K rWEKLR KLN/VFEI.TOPEE-/ RNRWLM r\*: ,K^P.F' TrPOMAKVW- :YR'~VHEAGLYE 
Kjft TLTr;iTPnKIILn«.»YC;;LVKf JILOLFLKFLRKM t:;PHK I RYFFA.'iTAVrrrKLORPHYHL 



AT. irt • ^netNifin. EMBL as of ll'7/98 



; n'MLICRY.'7"DDCFTE^\TKNTi<' : ^KU:F^«•RDNt.EGLTNPrSErVSET3SSrKDSV^RSL 
P r UZZ I UXARUV::n"L:zrNDPLDCroEKIWirr IFCALKWUiiiLiaLFfrli^ FMILHC IF 

HLvtGFcx / "'. r:r":'': r',Jr7 

':Pn.022': 263402 :e3';74 

No robust homolog present in Genebank/EMBL as of 11/7/98

YTFKNPKKNKKMKFNSriFLENTKHYFCrFREGFVRDRHCLMEASDWLLSTErTriRSr:. 
...... ..... — -^^p.-... „ - . 

i.*t'n_U^J'j Jnitj-ic ^o434i 

No robust homolog present in Genebank/EMBL as of 11/7/98
NSFTIKFIXffriCKAINSQrnTPOPNLTnAEPIASRAOCKS I AVI ISLFALGMLLLCLS: I 
LI S I P r PCLAAOVALCLC rVSLILCI AIJWICrLCIXLRCKCVPOKPCTLPSESSKOPSE 
GSTPTALPWOACEFLEKVOVSATP ILLPKNKDEEl^AKVKKEGAEAASS IKOAVLESTEK 
LIDARKOEESRREARKKIVAEEAEASRKRIOQOKAADOEALRKRJCEEVAKRK 

CPn_0226 264545 264967 

No robust homolog present in Genebank/EMBL as of 11/7/98
ArFNRKRMPYYAOTLEFIQCrrOSLCPLFKYGrVRHHYKGOLEIEDASHDWDrLEPPSWK 
RTLL^AI PILCSVIGLCRLFSIWS I REPODSOEYXS IFVfHTLCAVLEILGLSrVALILKI 
LATFIKAMPGLKRVATFLFYS 

CPn_0227 265467 265009

dsbB-Disulfide Bond Oxidoreductase

KERFNI FVSCKLUCEI WINFI RSYALYFAWAISCACTt rS IFYSYILNVEPCrijCYYOR 
ICLFPLTVILCISAYREDSSIKLYILPOAVLCLCISIYOVFLOEIPGMOLDICCRVSCST 
KI FLFSYVTI PKASWAPGAIVCLLVLTiaCYRG 

CPn_0228 266242 265412

dsbG-Disulfide Bond Chaperone 

VKDRAOFI/XLKEKFSCSILKKENAFEFYVFCSIKOLTNSSLRGPLNmL 

CrcrLIKKKHTILPPKAHIPTMAKHFPTIGNPYAPINITVFEEPSCSACAEriTEVrPLL 

lOCHYIDTCEISFTLIPVCFIRGSKPAAOALLCrYHHDPROADIDAYKEYFKRILTYPKEE 

GSHWATPEVLTKIAECUCINSGRSVKPKGLEOCIASGOYNEOIKXNNtYGSOVLCCOLAT 

PTAWCDYLIEDPTFHEIERAIQHIROLOAVECOHDD 

CPn_0229 266163 267560 

CT178 hypothetical protein

MSKAFSFlAIEOENFSnCFIOCSAt^FTWTANLTKSTFTriLLUIJUaCTC 
LENIYRHFRYRriJa2IILPArunXIiCSPt?rLNYT0VDVIFSDRU:SCU.IFI^ 
KRSUMXIAPLGIWVTLFACVACRSPriFANDTLICFAILAVVCISPTRPEALEVCPTLP 
EGFSYNPSACGRRAAVUT^LLCMLEARYLTASSLCrrSSOSSNFUiYSS 
VLSLAGSERRWKTRPKIVIATALALTCVIILTLLPI ILHOUlYDCWLCLa.TIEPALWV 
FAYDCTRATUlYISOFl^KRALTRASFFCSEYYKHTLSWEERTVLPLRKAYKaA?£niS 
rPINOLLAILVATVFVKVNSSMCLPTFPRNFLNICCWr I IVLFILATAESLRHUIWMNLI 
FSAAILf SPVLTH I PVESPHFU»r IVTCLrLIILSlCKRRRTICRKL 

CPn_0230 268277 267576

CT179 hypothetical protein 

RFKKALIYKSSOPLVTTSSSi:^RYV\rt.TCEEKVACYKKAf>miWHCAPAIIIJUUtf^^ 
IFGFVUISl LLCAPLECAS ILYDVtLWLLPSILVrTUVLPLMIYAYSHHKOVIALHEft 
rTOSNYKEIYDHCEKEKKTProaCAI^LyiESaVLWEYSKRFSSMItXacrtJCIIP^^ 
ESLXHOELIQKAI£RAK£KIYKNXNQRE]aW£R£AKK£A)aiASKTW 

CPn_0231 268996 268253

taua-ABC Transport ATPase (Nitrate/Pe) 

POATVSIODRGFSKLOAHRLCYSCDNQVILKDASFOASPGTrTI ILCSSGVCIOTLFRLL 
AGFLPWEGELLWNGSPLNRKDVAYMCXJKEALLPWRTALKNKTLSTELC IMTSHNALSME 
RIXErrHNrDlJG0UJ)RYPDELSOGOR0RIALAA0Ct^LKPILLIJ)EPFSSUJVL^^ 
YODWALAiCKENKTVlXVTHOFHDVSCI^DVLYVIKNKTLTPVPUJPSHRPLr^ 
OLXKHLYT 

CPn_0232 270134 269232 

*similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine
Nucleosidase

tCKFLHRRFLFLI LSSLPLVAFSADNFTILEEKOSPLSRVS I IFALPGVTPVSFCQTCPI P 
WFSHSKKTLECOR I YYSGDSFGKYFWSALWPNKVSSAWACNMILKKRVDLILI ICSCY 
SRSODSRFGSVLVSKGYINYDADVRPFFERFEIPDIKKSVFATSEVHREAILRCCEETIS 
THKOEIEELUCTHCYUCSTTKTEHTLMBGLVATGESFAMSRNYFLSLOKLYPEIKGFDSV 
SGAVSOVCYEYS I PCLGVNIU^PHPLESRSNEDWKHLOSEASKIYMim-UCSV^ 

CPn_0233 270439 270248 

No robust homolog present in Genebank/EMBL as of 11/7/98

EKARTWFUIKV.XLFtXRISRRSYVOE IG I FFHLETPOLK IVLCAFVSTFI WEMDVSLKN 
KCQS 

CPn_0234 271246 270548

CT181 hypothetical protein

FI HL03CKKALLS I WS ILAFH P t PCMG'/EAKSGFLGKVKCWFSKKEIOEEARI LPVKDS 
LSWKRYDYTSSSGFSVEFPCEPOHSCOrVEVPOSEITIRYDTYVrETHPDWrVYWSVWE 
YPEKVD ISRPELNUJECFSCfmOALPESQVLFMOARO lOGHKALEFWIVCEDVYFRCMLI 
SVNHTLYQVFMVYKNKNPOALOKEYEAFSOSFK ITK IREPRT r PSSVKKKVSL 

CPn_0235 271395 272177

kdsB-Deoxyoctulonosic Acid Synthetase

VFVRYLUIKPEEJECU: CnVLPARWN,':::RYPCKPLAKI HCK3L rORTYENASOSSLLDKI 
WATDDOHI IDHVTDFOr;YAVMTr;PT':.';riCTERTCEVARKYFPKAEl rVNIOCDEPCLNS 
EWDAtVOKLRiTSPEAELVTr^/ALTTOrrEE I LTEKKVKtT/FDSECRALYFSRCPIPFILK 
KATPVYLH U^VYAFKREALFRYLQI{r:;rrPLCDAEDLEOt.RFLElta;KlHVC IVDAKSPSV 
DYPLOIAKVEOYrTt:u:N/\YF

pyrG-CTP Synthetase

:;HT t YUMPFKi- r FLTT/ W.':::LGKCLT/^:;1 AL r LERORUA/AMLK LDPYLNVDPGTMNP 
KKIKTE t -rVTDTt TVETDLDl/^HYt f RF.':::/^AL::«Mr;.':AT.'y */j I YARV r KRERBG0YLC3TV0 
Vti'U ITttZ I rOV I LDAAKEICf-r.VL t VKKyrr TGDI Eni.PFLEA inOFRYDIISEDCLNIH 

»frn/PYU>/\ADEVKr:KrroM::^/vri.H;i^nri;Aiu;R:;EKPi;roEVK.^Ki:;LFnw 

«nAff;KYVOHRnAYK::tFEALTIlAALf'I/yrAAKtrr-tDAECEflLTMELr^:DAt:LVP(a:FC
MCLRIEVrrrCPPOGLCEUr/CDHPWMIQ/QFHPEFVSKLISPHPLFIArrEAALVYSKDA

.;hv 

CPn_0237 271741 274214

"?t.nW;IS.^KrC-rK.XY'unt3YCKKRrOLAYAAEPLLL7tP 



CPn_0238 274210 275838

GOLTARKIXPALYHLTKECRi^OOFVC^CFARRMigJRODJ^ 

WEDFC»RL^rHRSEFD^W^CVTSUCDSLEDU3KTY^^ 

KHKLFYKNQC30GKPWSR\aiEKPFt3lDU:SAKOL«5CINENU^^^ 

:.£SSmn3ADEIRKaCIKILQRISPFSBCSSrVR^GPG^^ 

KDSRVETYVALmI^WPRWU7^^FYLRAGKRU0CKSTD^SIIFWC^^ 

3 r EtTOLi: : R lOPDBGVALKFNCKVPCTrWIVRPWtDFRYDSYTOmPEAY^if^ 

: ICORTLFTCX:DEVMASWKl.n'PVl£E>roODSSPSFPNyPAGSSGPKEXDALinUX3^ 

RPL 

CPn_0239 275863 276672 

devB-Glucose-6-P Dehydrogenase (DevB family)
KSISHTOIGIETKATLINnrarimilXTKOPSLFIDIASKEWI^^ 
SGGKTPLEIYKDIVINKDKLIDPSKIFIJT«D£RIJ^PITSSESinfCOAMSIUlW£^^ 
OinWETENPDGAKKYOaiENKIPDASFDMIMLGLGEIXnrrLSI^SOTSALEEDn^^ 

FNSVPHLE:TERM7LTrPCVHXGKHV\nrrWENKKPII^ 
RSPLFWI ISPESYDIADFDNtSSIYKMDIL 

CPn_0240 277861 276698 

No robust homolog present in Genebank/EMBL as of 11/7/98
l.VYFMVrSPSSESVVKANSV\mSNFCYFLE141CFVSPSESTEV^ 

RPTl>w^^^G^cwyJ^^X2JL^^^s^GILIMCFSQCXSCC7^PE3CET^ 

LGPTLCALVYCAYKWUIKMIYSLWAKAKVIJIHPAONVFWRAACVATI 
KLYKSAMIGSLWSLIASLALIALTACIVLVLn^VXPGAAWITAAMMGCCAAGO^^ 
SLLGlWIAIVRKAKKQEACVGHLTNWlJlTAVSEAIXHDPSHFXTrWAIARDm 
YGHLFSNEEVAOLVQGGAPGGGSRPSOHYOGSSOYONRRGCaOffGGSHrCaOOGFAGSH 

rGACYPTAPTMPSAPPPFPPPAYCrrnfG 

CPn_0241 279372 278203

No robust homolog present in Genebank/EMBL as of 11/7/98

I FLVKTMSAHI SI^SSHEAS I AS^m3VRDVLVSlAMDE^VEH^f^EILPIKVFLARCTLSS 

TAtlODUaJNA/ETBCEHHFOVYSNISLKMIYQRFrEKXFCIOCCPI^VTDSHHrOPOCA 

LITCIFAAVLFTVlJaVFCPTI/;iI£YSAYKIYOLTKKISSLSFTHTEVINSVOK^ 

HRSGAVAAAAASOSTIKACKVrROSTLIFFVLCLIITISLW^IVGLVFALrFLDPGAPA 

VWAAMIOCCAAOGTGIU^CrUJ^SWSVOKSOBGVHHMHTAI^r^ 

LPITPGTKKVLTOS IRRYQQFFSODEYRDIESEVPLNRQTrPPPSYETLFKEEGSDCSSN 

VI PRESPPAYSTIDSSNSPFPSSSPPPYVR 

CPn_0242 279975 279487

No robust homolog present in Genebank/EMBL as of 11/7/98
KSUCYCSLYOFSOKPTVILMACSI FFRMSOGDYDDEPLSKKTACLNArarTMLYPVIAVVCA 
WSVVLLILKVtFLLLSFPFKI^SASSALPGERVSLCSHFKCLyGGGLPYLLACLLrVFV 
IGTAIHGF I ISHRTSEDARLSSAIVntOAP ILOLACHSGLIKP 

CPn_0243 280609 280133 

No robust homolog present in Genebank/ML as of 11/7/98 
IMYNYLVFLUCrVKGRI IMACS ICYKLCNANEPDRFVASKVALVADILLYPrMAVICAW 
FAVLKVVKLLFlAIKFLVfnriAACKSRPLPSCKeiFQCLFGPKDKPCPSIWIXKa 
I IGTLIYSTI nVQSDTTIRLRYri ISPAYQVGSTAIINW 

CPn_0244 280906 281556

adk-Adenylate Kinase

GAFEVTKCSVF IIMG P PGSGKCTOSQYLANR I GLPH I STGOU-RAI IREGTPNGUCAKAY 
LDKCAFVPSDFVWE I LKEKWSOACSKGCI IMFPRTUXJAHLLDSFLKDVHSNYTVI FL 
S I SEDEI LKRVCSRFLC PSCSR lYNTSOGHTECPDCHVPLlRRSDDrrPEI IKERLTKYOE 
RTAPVI (VYYDSLCKLCRVSSEI^KEDLVFEDI UCC lYK 

CPn_0245 281627 282499 

ydhO-Polysaccharide Hydrolase-Invasin Repeat Family

TCOKEIMKHYLSFSPSADFFSKOCAIETOVLFGER^/LVKCSTCYAYSQLFHNELLWKPYP 

'^HSFRSTLVPCTPEFH IHPNVSWSVDAFLDPWG I PLPFGTLLHVNSONTVI FPKD ILNK 

MNTIWCSCTPQCDPRHLJlRLhnfNFFAEU.IKDADLIXIIFPY\AaMRSVHESL£KPGVTC 

GF INI LYOAOC^hA/PRNAADOYA£X:HW I SSFENLPSCCLI FLYPKEEKR ISHVMLKODSS 

TLIHASGGGKIC/EYF ILEODCKFLOSTYLFFRNNORGRAFFCI PRKRKAFL 

CPn_0246 282955 282551

rs9-S9 Ribosomal Protein

WAKST lOESVATCRRKOAVSSVRLRPGSCKI DVNC)C5FE0YFPLE lORTTI LSPUCKIT 
EOOSOYDLl r RV5GCG LQCOVI ATRLCLARALLKENEENRODLKSCGFL.TRDPRKKERKK 
YGHKKARKSFOFSKR 

CPn_0247 283430 282969

rl13-L13 Ribosomal Protein

0:;Y I IMEKHKOTm" rVKSSETTKlWYWDAAC)CTU3PU:3EVAK r LRCKHKVTYTPHVA 
VIV tNAEK\mLTrj0<KC0KIYRYYTCYI2CMf'XIPFENKMARKPNY n EHAI KCMM 
ll<rHU;KKyi.K:;L.RIVKGDSYCTFEGOKPtLLDr 

• TiiJI:MH ::k445J Jal^SO 

V'lV/Vl^iA Tr.in^or>rt»»r ATr.iB*f 

h: ;i'f ' rri.K: ntvrfYATF. :R::FR.':nA( :KK::RKNAirLP.riFKr:RLJVR';LL i eaknl:;ktiooOn 
'.'N l; t i.Tt)V:;i::i Ji#v :i:rt:; rrtjA:;; ;Ni:KTTiXHU/rrLDVP:;:*/;::LRFFOKCLKNO0LA 

rJI-KtMl t« :i VITjHKVt.I.t*Dl7r/LKNVI>irAL[ARKtl i:;K';:;rVYTRALEtLOLVNLEDKV 
iri't'i;: :ku > :i:K«.)nvA r ahal t nei-a t Ll j\DF.r:y:tiLDEET;;EO tllNLLLEOAJALCC r L 
•vntN»tni-v:i'':::t(U:vi^:M;KLFritN:: 



YKDsSxVCITKT/rSTYSlEIiEKOTYW/tCrVNPCl^PL^ 
SKLa4S>CFHLFFPimRIVFVKKOIENILT3LCVODYWEt53U4D^^ 
D0Vm.^^^IUILrVAC3NI^m^SMt.LVNNKKKE:CILXAMCTSSRSt^ 
ACCWIGTtFAriTLKNLOFIVKALfT^U^TRCTFTn'AFFCONLPNSVHPOAIY^ 

• • \ A V ' ~ \: .r P p.' ^^■'yt r.'"r ? ' *•* '■ * 

rl33-L33 Ribosomal Protein 

KDSSMASKNREI IKUCSSESSDHYVmrKNKRKTTCRLEUCKYDRXlJUUmF^ 

CPn_0251 286036 287559

•conserved hypothetical protein 

SPOSCLPWMSPFKKIVlflU.lJGfISFOKESRTLPi::REPR»frTKSLGSFKSVIS^ 

ISLXSRNL\mSEVKLGIUJCACYESTTrciEDADYLILrrrCAFU<SA^ 

VKKEKAXIIVTCCKrSNHKOeiJCFWMSHZHYLLSSGDVENZLSAXESRCSCEKXSA^ 

E>CEVPROLSTPKHYAYUCVAECCRKRCAFC I IPSIKCmiSKPI^IUCEnilLVNX^ 

KEIILIA0DLCDYCia3tOT)RSS0IXSLLHElXKEI<mYV)IJl«LYLYPDEN^^ 

SNPIaJ:.PYVDIPLOHIN0RILK0MRRTTSRE0ILCFLXKUl^KVP0VY^RSSVIV^^ 

TOEETOEIJ^FICECWIDNIX;! FLYSOEAOTPAAELPOJ I PEKVXESRIJCII^IOKRNV 

DKHWKLlGEKIEAVIDtWPETNUiTARrYO0APEVDPCIIVNEAKL\«HFCERC^ 

ITCTWYDLVCRVVKKSONOAUJCrSKA 

CPn_0252 288112 287576 

CT144 hypothetical protein (frame-shift with 0253?)

ATST\A:AI>fILOTyOSHDDAASCSrRRACRreRYWl«WVPWNKrNQTST0ST^ 

YIDSSOTWMWU^OASASIPRIXRISIFKTKHCDWDtKTIXXJELLlA^AYEANONPL^ 

lElJ^MSTCSGTSYYlURPMOWLCSTYYAVRPGYrVLOIRSYS FRVQSFSWJIATLPrW 

CPn_0253 288474 287950

CT144 hypothetical protein (frame-shift with 0253?)

FCOCRIJ«SS I PTTOKtrr SIPTTWria ES Iin.TOB0KKTALTIGCNIATEim]V^ 

vDAax;LICO^roLSVGGNrNITPaTF^m^v^scRV^Il^sPFSYOOSua<Iaw 

EQPOQYVPTCYYKLTRVMMHORAALSGCKVCSCOIG*CESMYU;iSSIKRQK^ 

CPn_0254 289268 288459

CT143 hypothetical protein 

IPKICrtJG\AaX)mJriD0ATLS\rt3lNVRIDWLErRDU^^ 

QUaTTLSDGFWIYSKTOVSOTPVCNNISDPOSARDALTFSYYRJCTGCOAA^ 

CYYVXPmTlETKVAAITSKSVSTOUTPOFSRYADIEPVVKLKC)VGIYQVnWL 

KOnaiSATLXLNFVSCWncrLirTSOTRGCYSSDRTSVAVTAIFSVT^ 

NLESTIWM^^LMSLSTCVIWFPFPSNFVEVD 

CPn_0255 290183 289329

CT142 hypothetical protein 

TUJCVXMKNNXNNNBCYFKUDSTVaroiXAANU^ 

vsATOTSGT^VNt2^AO^^^^ssoISXOFla^^^.s^cALPKEIrDPVPAMrvRSPEYW 

KPLIGDFDFNSCESYLPLTCSEYTLYOSRNVNSIFRFIGWKOSrRELTVaWTAlOFlAA 

(nYIVSFTroKRWGWr©O«yVIYXNNGI£QVQCESTIYS0GCYAT 

VAPNPNDPNASDRYRACIFYLSNOCSSACIGNYSFSLLYYPODRC 

CPn_0256 291282 290398

CT144 hypothetical protein 

FOXail>!SNPTPlCrKISIPTFVRrNI0SXNLTEIWKKTTFTVCGKV^^ 
- DCa.TCQSDLTXQKDXNIRfT'STNSMVrt)CRUn^SPLSYKNSCOT)ITDY^^ 
OEYVPFWYKRTOI W4A0RAAHSSGTVGCCSVPSGSYVPWNK1* IXJTi'lXHtTSG'l'LIYIDP 
NDSTIXVFTVMWVPKIJRISVIMAXHGSWLDlXnX^ADILIAA^ 
TTSRCSSYYETRPLQVVCVTYYAQNNGYFTFONRACGGLRVSFFSWIVALPYV^ 

CPn_0257 292136 291267 

CT143 hypothetical protein 

GVVMKRRNWXILPNASTPST^IVAENTGIKDOraJ^^ATL^JVDG^M)IE^^^^ 

ArrriTSPCEFTVOCX3LSAESS0FWVTTt5KCLEITSED0DGRWKFTWSDP0SPRnALT 

Y!WRNTCCOAU^YTYYSSSOPTnraKPIETVCQNPNPETYRISASAKIYDAVTRFPYr 

OrKAPGIYOVTrOIRRESGOHSCU»iPNLYLNl><ICNNKTLirAStm^ 

TCTrrLTElVATPPHDYPWLFLCTTICLDIKSMSTCVIWFPFOANFAEVO 

CPn_0258 292534 292133

CT142 hypothetical protein (frame-shift with 0259?) 

CFSFCRLGSK^EKrTl;oa^'Arou-AAcrYrLTFTICICRv<^^w^c^»^ 

CTKLCCSTVYSCOGYSTIGYLSTAVYRDHSDIOPDPNNPSDKYMNNFLFVRNCDHSAVIC 
NYSFTLLYFAGDKV 

CPn_0259 293031 292441

CT142 hypothetical protein (frame-shift with 0259?)

r/FVFKRKTYNYFIE^^^TKWON^^ECYFKLDSTV0COLL^NI^n'FDK0 
SVOaiATFKEKVSATGLTSASTYKLNATGPAPSSITIDMKNNRLSNPALPKNPCDPVPAN 
YVRSPOYFFCAKPIBGTFKFOGSSRYLPirax;SWTLYOSSKACDVFRFVt*roQNSKKL 
HLOCTOPYNFLLOEP I S 

CPn_0260 294090 293548 

secA' Protein Translocase Subunit 

AYUJFSKRSCVEEDHVSKKtNRflDLCPCCSNKKYKOCCLKKEECyrARrmCKFKrSAEV 
LSASEOCEACDNCTKUTJRLSOSLTSEOKAAVCKFHOITKmeVMSKIO^UCKAOAKEEKL 
NTTEKLOOHNFEX LNTGENLAPPMESTATUiOOTNFVCEDF IPTOEDFR I SENSOKPPVEE 
0 

r.?nj)2t\\ J'M272 2'>5»» H 

yrA-jO-Pf-'Loop ."iiprtrc.uni ly ATP'ise 

Yr,mt?F t VFM:rrLLLNPFWMKAr:KR I E.'!LVRKALYT!(TMLANriHK I WAIiW:KDSLTL 
ULMUCA I.'T^RGFr-DLOUIAVN r''>':KYr/r'»AI'^WKPYLTh troOtC I PFRT CP.'IHYAPETP 
ECYrC.-^OARRRLLFOAAKE lOA^'A [ AF ;i " IRPDLVOTALUJLU IKAKFA(XLrVI.DMVI IF 

irrr t LR PL t FTPEFv r RKFAK w r :FAft vTt'Rci vv::( JO ;kaw::i jt li^tvFPLAiniN t A 

UArOEJIGr;nK;X)KI 

*;hi jHi»;i j'tMiss .:'»•'•» » ^ 

surK'UncE- lik»f Aci*) l+t»)::ph.ir,»»:»» 

(. t FN mKEVKV/LVLMNKRl4< 1 1 L'miAJ I tTAW W:U :f .Vr:Al,LF.,\N ft :0 1 Y ( AAfMAfSj:: 
• :k. W/\ I. :*L^J0V/CASPYA7PCPVKEAWAV0C3Pt . .<LGUlTl.FE3V3PDLVrSGrNCC
f/N :^•KNAV^r:?^• [Cyu^KOAL'/tx; [PCMALSODrmrsrrOODKAPEIUCALVIYLLSQPFP 

^L-rr,LN [ NFrr:; pfx;53wa>m Lvpp^OEFF^EEPOYUJSVNKNQYYvcK I scwr^ 

- EEuACMLENH E SVG P [ F.-^OWa P tCLKTLCEFOKTOENTNASLLSSELrrKIF 

CPn_0263 296174 297136

VTttJ nypocnec trr-il protein 

TTAL'^ P R KLRVR P PrXArMFRCFRMSHC PRPTKFSFPLYFSKTLSWri UX;FLAACC;V0 

• •■ , : ; .i. :■.%■[; :M:;; rvr'.v"::"-y r rr.~::. : : : r.*^«* . : : ?) kv i.rrccTE: r/ ; 

; £ I^IKKKCYTVCOIII.FWF^r^AL:;GIVYK^W^A^VSF[,TYCIATKVMDMVIU;LEDT 
KSVT r ITSSPRKLGH t LMETLG ICLTYIHAfZKYSCEPRKLLYWVERLQLSOLKElVHR 
EDPSAF I AI EliLHEVINCRRT 

CPn_0264 297730 297155

ubiD-Phenylacrylate Decarboxylase

MKRYVVC ISCASCVIIJkViai KELVNAKHOVEVI r S PSCRJCTLYYELCCOSFDALFSEEll 
LEY IKTHSIOAIESSIJISCSCPVEATI I IPCSmTVAAISICIJU3NLUWVADVAr^ 
PLrLVPRETPLHTIHLEM-IJa^KSCy^TirPPMPMWYFKPOSVEDLENALVGlCIIAYLNI 
PSDLTKCWSNPE 

CPn_0265 298632 297730

ubiA-Benzoate Octaprenyltransferase

:<: 1 1 VRUVYFUJLVNFKYS I FS ILFLSASTVFALS INEI SONLSFKBGFKI SVFGAI ATV 
FARTTG rWNOC I DRF lOKKNTRTSKRVLPANLVSLNFAWVLSLTCSFLFLFUrKILRIF 
5LGIASLTUMIVYPYMKRVTFFCHV«LCLVYTVAILtWFCAFAESCLSKRI£FIAIX^ 
SVCMVI AAND 1 1 YA I EDTEFDREBGLRSVPAHYCEKKAVErAKVNt>rvSYLAYIFSGrVG 
SE^KEFYH-AI I PLWIUCVVRMYSNYSKKWBGESKFFIJWIAIAI^FLVSK^^ 
R 

CPn_0266 299181 299876

No robust homolog present in Genebank/EMBL as of 11/7/98 
IMALOEIWNNPSOQIASSTSOrSKIWDRKTrACTVTLLWATWILSGIVU^ 
t^LSVPLSC ILCTTFAVTVCAVLF I TGLTILVRKSLC lEOKNEDUirUCIKTPTPPA^ 
SKFSVTCSrrSTVLGKALLIGAWSVFFLTCrrWUILCAGLVGL^ 
LADOECSGSADSOSNIVGIGEPKAAOEQKWYKMAWRGEDCrPTAIRLTPEK 

CPn_0267 300122 300910

No robust homolog present in Genebank/EMBL as of 11/7/98
VS IMSll^C^NAUWPEPA^rcl^WWDPKYINODRICTFACTVTIX^aAT^ 
AMGSPGLSVLVSTIIGTSVTTLGTALFI IGLVKLIKKSLAWIOYOKYFOENAntOKYEPFS 
IPKNtWVHKLTSCLPSPLDIESPSPEASTPVSKlJlIACSGVAIVI/^VTlXICAVVSV^ 
TCYU)lAIXrVCFAClXrrALFVCGU«;LRTHSLIAOGIMY^ 
RNEINTYLTEECRQOKREKALLE 

CPn_0268 300914 301316 

No robust homolog present in Genebank/EMBL as of 11/7/98
KOWALSLMSOCOSSSTSTWEWMKSrvPNWKNPTPPLSPrPSEDEFItAYEPFVLPKTO 
, NAOANPPGTSTPWTENCIDDLNPUjGQPNEONNANNPGTSCSNPTSLPAPERLPETEnJS 
QFKFOGSQNNEDLIC 

CPn_0269 302468 301476

Dipeptidase

VAFRCV^f^II»IHCDIiSHPHFCIUCDPAVRCSPEQIJ:*SC3CnmQQ^CAIF^ 

QNSlJTSLPNOYPDIGLLSy£EEENGSSSOKKSLSLIRSIENASAIX:DDTAPt>CTUJUCL 

IHLTKOCPIAYI^IVWKGDNRFaxn'EAPKRLSNreKVIXDIMYEIXr^rDLSKCS 

EDILDYTADKLPNIAVrASH5NFRSVU3HRRm*VDAHAKErVRRKGVICLNLW 

UIDLEKHVUlAENLGILSSrVLCSOFrirANEDENrFFNECSSAEAHPVUiOLIKRIFSKC 

KAESILSSRAEKFLKOVIVEOVNPKITDVKL 

CPn_0270 303343 302468

ywlC-SUA5 Superfamily-related Protein

3 1 PGVIVPDKKAOITFSLPEVMSAIHOCKrVALPTDTVYGFVLSLYASEAEERLYAUCrR 
EPSKAFALYVNS I EDI ENI SGY PLS PTAKKLAOLFPGAITLWKHRNPRFPKETLAFRI V 
DHSWRSrVDHCCTLIGTSANLSEFPSALTAOEIFADFADHDLCrFDGPCSHGLESTWA 
SDPLYI YREGt.1 SRSVr ENI ACTEAKI FHRTSHAFSKHI KI YTVKNOEOLVS FLSGSLDF 
KGWCEHPKPKNFVTRLREALKKKTPS IVFIYD INTSDYPELF PFLSPYYI E 

CPn_0271 303628 304362 

Lysophospholipase esterase

KLWTDYSFFRRKIGNl EAI ECPGNPQDP 1 1 ILCHCYGSLADNLTFFPS ICSFSKLRPWI 
FPNGILPLENDFRGSRACFPLNVLLLOELSRLYANGVGNLOEKYDELFDVDLETPKEALE 
ELILNLNRPYNEI I ICCFSOGAILATHLVLTSONPYAGALIFAGARLFNOCWEECLKOCA 
QVPFLOSHCYEDEILPYHUJAHUroi^TKl^FVSFHGGHEIPSVVFQKMafVTVPNWI 
OPARG 

CPn_0272 305272 304340

dnaX-DNA Pol III Gamma and Tau

FNROSOAr/'ATWVMHLEEENOGWEALLRKVYHOEVPPAILLHCrTLPVLODKAEQLASEI 
tXSSSPCSEHKVSOKIHPDIVOFFPEGKGRLHSIDLPRGIKKOIYISPFEANYKIYIIHE 
ADRKTLAAI SAFLKVFEEPPKH AVII LTTAKVQRLPKTI ISRSLS I F I ERGEKI LCSKET 
F.~YLFRYACCEI PVTEVSO I IKESSETDKOVLROKVORFMEVLLELYRDRYTLNLCLKAS 
ALIJYPEHVKE I WLPIXPLOKVLLI VESACRSLNNSSSAASVLEWVAIOLVSLOYKEKEL 
V3VSPC0CLSN 

CPn_0273 305853 305227

tmk-Thymidylate Kinase

. rVFTVIECCECCCKSSLAKALGDOLVAODRKVLLTREPCGCLICERLRDLILEPPHLE 
L:;ftCCECFLFLCCPAOHIOEVi:pALRDCYtVICERFHOaTr'/-rOGIAEGLCADFVADU: 
.;K'/VCPTf'FLPNFVLLLDI PAD ICL0RKHR0KVFDKFEKKPU:YHNR IREGFLGLASADP 
: :R YLVLDAP E; ILATX I OKVMLHTOLoLCT 

•'•r-ft_02V4 JUtiJt^B 105852 

gyrA-DNA Gyrase Subunit A

M:TII-MmKDEtrVfKNLEEEMK£:;YLRY.';M.';VC[.';PALPDtRDCU.Kr::ORRVLYAMKOL 

::[ j:t 'iAKup.KCAK I'.f iDTi^corHptKTESv I yptlvpmaonwawryplvdcc^infcs r DCO 

r I vXAMP/n"!- AI< LTIi:;/\MYLM EOLOKDTVD r VPNYDETKHEPWFPS'KFPNLLCNGSSn I A 
'/'IMATTI I CI 'HNU;a. I EATLLLLANPOAf;VDEILOVMPGPOFFTr,C I ICCCEGIRSAYTT 
' IH' :K I lO/P AKUIVF.KNEDKHRE.': I E ITEHPYNVNK:;PL I EO EANLVNEKTLAC ISDVROE 
: :bKf/ ; t r- WLE l KKf;F.C:;E E t tNRLYKFTDVOVTPnwiMLALDKNLPRTMS IHRMIIZAWI 
HMHKE'/ 1 Pr'f'.TnYr.LHKAETRAHVUECYLKAL:;CE.DALVKT rPEGf^NKEHAKERI tESFC 



FTEPOAt-MLELRLY'SLr::: . ICKEYRELUOCIAY-^KCVLJCECL-.TS: IRNEU^CL 
LKHHKVARRTTtEF3ACDERDrE0t ITNEJVI ITISGODYVKRMPVKVFKEORROCHCVT 

H r\^DEEKVMLmLC>i\VR FPHEKVUPMCRTARCVRCVSLKNmJKWSCO rVTOJQSV 
L IVCOOGPCK.RSLraFP ETrmCC^'CNJTlS I LINERNCNVliCA r PVTDKDS r LUISSTO 
IR ENM0OVA^*;^CR3T7T/PL*yHUCECCALVSMEKt^SNE^^^D^/LSCSEEECS^n:S^ 
3*0 W«r 

I ::_ 

r,-rV \* ■ . =• 

l- M DPKEKN"/ ^* A I r A V H tR HJM Y ICETTC I rJUit ( L'-TUVVU w : OEAM^JVC J 

RIDVRI LECCCrV IVDNCRG I P I EVHERESAKOCREVSALEWLTVLKAOCKrDICDSVKV 

SGGLHCVCVSCVNAt^EKLVATVFKDKKCYOMEFSRGI PVTPLOYVSVSO^^ 

OPKrFSTCTFDRSIUOCRLRnJVrumcrrrVTEDDRDVSFOWFFyBOCIOSr^^ 

0NKESLFSEPrYICGTRVCDDCEIEFEAAU3WNSCYSELWSYAfNIPTR0OGTKLTCFS 

TALTRVINTYIKAK^aJVKNNKLALTGEDrRBCLTAVIS^^CV^WO^KJ^^ 

SVAOOVVCEALTIFFEENPOIARMIVDKVFVAAOAREAAKICARELTLRKSAUJSAJU^ 

LIDCLEKDPEKCE2<YI'/ECDSACGSAKOGRDRRFOAILPIRGKILNVEKARLOKIFONOE 

IGTZIAALSCG:CACNFT^KIJtYRAIII^rrOAI3VDCSHIRTUI.TFFY^^ 

VYIAOPPLYXVSKKKDFRYILSEKEMDSYUifljCTNESSItJKSTERn^ 

ILOVESFINTLEKKAI PFSEFLEHYKBCrGYPLYYLAPATGMOOGRYLYSDEElCEEALAQ 

EETHKFKI I ELYKVAVFVDIONOLKEYCUJISSYL 1 POKNEl VIGNEDSPSCNYSCYTLE 

EVINYLKNlXRKCIEIORYKGLGEMiADOI^aTmWPEQRTLIHVSIJCDAV'^^ 

MGECVPPRREFIESHALS ZR INNLDI 

CPn_0276 311140 310793

CT191 hypothetical protein 

DMFIJauaa^GGSQV0NlCRTASPIKHAJa^YU^^miOELOKIMAARPHIJA 
KGKS0A2GFRDHILLVKVYNSSLYALIJCQTPONDLIMSLY0VASHV0IREI0FLLG 

CPn_0277 312003 311404

No robust homolog present in Genebank/EMBL as of 11/7/98

NISIFYPKYFIECKEVLIKNLPPLrFVCVILMIINVRAPArGITSVOOrSTNFOAAIPIL 

NIVICCSRISSTYAEDIEEVAOEKLEKCTHSKSCTSVNl^/AHRVRCVVEIUSOCrVILAL 

EITALVU3VIIKLIKCLlDVti:VCLFCLGVCWAIIGAlAFCVVVV^^ 

PIEVKTLISPDKPYPTWYV 

CPn_0278 312884 312060

* conserved outer membrane lipoprotein 

RDSMKKKl^LLVCLIFVLSSCHKEnAONKIRIVASPTPKAElXZSWEEAiaJLCIKIJC 
PVDDfYRI PfniLLIIJK0VDANYF0H0AFU3DECERYIXKCELWIAKVHLETO 
SU3lLKS0KKLTIAIP\mRTNA0RALKIXE£CCLIVCKCPAm/M^ 
I£VSAPU.VCSLPD\a3AAVIPGNrAIAAm^PKKDSLCLEDLSVSKVTNLWIRSEDW 
PKMIKLQKLFQSPSVQHFFDTKYHGNILT>frODNG 

CPn_0279 313546 312875

* Possible ABC Transporter Permease Protein

KKIM0SDLI0IUJCETWm.YKVSTArFFSCAIOGMLCI/;UXrrSPKSLNPiaCSLYAT 

MII^FLTAIPFAILrVIIJ'PITRWIVCTSLCPTASIVPLTICAIPrwrvVnAFWISAL 

NYLESAVALCIPKRNIU^IIXPESYPOLIFSLKSLVWLISCSTLACrVOQQCafiOL^ 

QYCYYRFEWSVTTSVLVrrLVLIESVRILCDFWGRRVIJCYRCIL 

CPn_0280 314593 313550 

dppF-Dipeptide Transporter ATPase

IKCEAWLVSE0HSPIISV0D\raKKLCDHIUiKVSFSVYPCE^^rVCH9GSCKTTIiRC 
U3FLI»lPTSGSISVACFDNSLPT0KrSRRNFSKICVAYIS0NYGLFSSKTVFEKIAYPUlI 
HHSEMSKSEVEEOVYDTtWTia,YHRHI]AYPGNLSGGOKOICVAIARAIVCOPEVVLCOEI 
TSALDPKSTEMI lERLLQLNOERG ITLVLVSHErDWKKXCSHVLVHHOGAVEELCnTEE 
LFI/iSENSITTiELFHEDINIAALSSCYFAEDREEVUU^SKEtAIOGI ISKVIQTCLVS 
INILSCNINLFRKSPMGFLI rVLBCEVEORKKAKELLIELCWIKEFY 

CPn_0281 315033 316103 

dhnA-Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin family)

rSLRRHTWLNIHDILCNDDEIJlXSYOCKHITKDKLTLPSHDFVDKVFCLSDRWffiVLRS 
l/ymFSHGRLANSGYI^ ILPVDQGIEHSAGASFAINPI YFDPENIVKLA: ESCCSAVAST 
YGTl^l^RKYAHKIPFMLmiHNEtXSYPTKYHOIFFTOVEAAYSMCAVAVCATVYF^ 
ETSNEEIVAVS^IAFAKARSLCLAT^rt.W::YU^NPAFVA^CVDYHTAAOLTCQA^»^^ 
ADIVKOKLPTCOGCFKAINFGKTDERVYSELSSNHPIOt-CRYOVLNSYCGKVCLINSCCP 

sgkndfteaartavinkracgmglilcrkaforplseciqllnlvodiyldpnitia 

CPn_0282 316084 317529

xasA/gadC-Amino Acid Transporter

r L IWSLNFSKKVFMHSHSKPTKPI/n'FTVCMLSIJVWI 3LRNLPLTAKHCLSTLFFYCL 
AV rCFM I PYALI SAEl^FKPa:iY IWARDALCKWVCFFAIWMOWFHNmVYPAVIAFI A 
ST IVYK INPEtJWNK\nf t ATVILAGFWILTFFNFLG ITS3ALFSS ICVI ICTLI PCVILV 
SLALTW r FSCNP t A ISLSWGNLE^PNFSNVSSI^VlXACHLUa^LEANANl-ASDMVNPRK 
NYPKAVF ICA lATtT! UXGSLSI AI V IPKEEISLVSGL'/KTrTLFFOKYNLSWHTCIW 
VMTI AG3LCEUJAWMFAGTKGLFI STONDCLPRLFKKVt rSKNVPTNLKLFOC IVVTIFTL 
LFLCLOSADLVYW I LTALSVOMYLAMY ICLFLAGP I LR I KEPRAORLYSVPGKFLC ICTM 
5 1 U3H.SCAFALWVSFLPPRELAQ ISECSKIGYTTFLLLAFSLNCtI PFC lYFTHKRtSK 
KS 

CPn_0283 318581 317532

No robust homolog present in Genebank/EMBL as of 11/7/98
CRRLGYF0DLIKNAVAKII3FRKSPPNPVKLLIKFAKKCLKNSS£APLVEVLLEILEAPC 
EEILEVLFSLOPMWLKSMLOPKKHSTLCIEISSETAET lESCSLCL IS INI*LLSCLCLRS 
SHDRCOAVK t tOOFCPCFSiJEEVONFVEORN E LTPFUIHLFECDEVALLNOLnLRUM.IV 
PNALYPEPDP JCH::^ I NSECCAKDAEOOOEDFHKTKEACKEGUKKLVLPALS ITS t PQU* 
RARRFKOCAE I LMA EPRKKHKONPF I FLEALLE::EEr" E .T/nKYLKU>ir IIILWDKLLIIA 
t YET.YFCrnJL ICC* 'F: I ETFi^RRANI^PEAFQAA rOOnf'I.L.-FLFr'KMU^ 

Ct'nJi'A'i^ U*»l>S4 tlH«)5l 

yuMtt f PArovr-v i \r.l^\•u^rr::r/r:l:;LK::::LUl' m:. i e*a r ( jv r ATtK*:vi,YF»\;t i:; 
VI rrp/i /'MLiiu :vi *vAYi .r/tjtjr.r. i py.rrvrr. i rr.r: :vpr: ipepi a* ;r eeo: : 
v::a tDELr.KNFr»Ai)iiKKRrKMi.fY:;nFii)E'/;fM'NKi;i'K£o::H'p.':K 1 1. 
5>0^( 

No robust homolog present in Genebank/EMBL as of 11/7/98

Ki-:cniLFFFT/\NKrpr^::i I w . ( /uvt V :v: :i ■ : f.-/r i un jya : :vt4a .i * ^a/kaim v iivr. 
AAPLCLL'-V3CAAJVC3MKArV3LMCLrKCCKP* JNEEKIDPTKDLEIKOPESLKPV
PVEXX35LrKERKT«/3nCAK I PS rVEDDFKPYVrOS-rFYHONKVYSKPIAERMQSLEKEIT 
TL I VDFPRA L EESSJCaSCSU^CVrSE I KNLrLPRFL3RKVr/3LTACLi»JU>CS IVEEYA 
SSOU-ILLLTXPEPLNMVTOOLIAHUJSUCTEXRKLTPHMOKLVLSrNFVfrfW 
lEKrVAYDPNLOTOELKAHUy^IVOFXLSFOSSEMORETRALFPSDAOELPSAKDGSN 
WPAINSSEY^frDFKDL.3VTJCKSL3ERlJU^CEK^PSPSSW^^FT3SVASH•nCDFSLLFT^F 
5N00SVI LONPFLL: ELLHEKPKCOTFUCCtXEKAMPMSNWAALFRPMLMGHLCSCIARK 
KELK r t AEH t/TVPFKEITOA I ASCK ILOLLLOHLFDF 

mgtE-Mg++ Transporter (CBS Domain)

SCR£SKCKIMVCEC^mNEEKU)TAFSSO^U<DSRTSHU}DEl^FKlXKA^ 

DLSKIV r EVNP r DLAYAVSCLPSESRAILYXNLSC ITAKVAf I INTDSASRWAI FRRLSD 

SEVCALrE0MPPDEAVWVU3DIPDIUlYRRILELIDSKKAlXIRDWiOiGRNTACI^^ 

FFAFUlETTVKWSACIRSNPGIDLTRLVFVLDFKCELOCWrDRSLIINPPEMSLKOIM 

NO I EHJC\a.POATREEVVDLVERYKr AALPVVDEDJFLICAITyEDVVEAI EDIADETIAR 

MACTTEDVCYTTaiVVORFLIJUPWIXVTLFACLISASVMAYFOKISPAU^ 

INCMSCSNVCVOCSTILVRSKATCTLSFCRRRETIFKEMS IGLLTCWLC ILCCLWYLMC 

FI/;unFSOaC10U;WATC;VLCASLTATTU:;Vl^PFFFAK^ 

IMSMI IFFLI ACCINFLFFW 

CPn_0287 324230 322089 

No robust homolog present in Genebank/EMBL as of 11/7/98

RilCMI RS PLPFI SSKRAUIMI^LODEFSCPEDVVDFLFSEIEUAOODEPSEGYLALSRS 

UiWTHNHPKVVKRVIFYCVSYCUaiKSMSIFICA^TYIDFLFEKMISASDRl^LCSA^ 

TC IN^ELYSOTGEMCFLSEVVD^ffm.I EQLLKMHPQLlOJRLCWEHFRrC 

ASVYOAVCRSFIELYHKHLELSDIJUXKKCIJaJ^IIJLSPNNAHIHADYAKCL^^ 

KSU.IERG«EHFSKAIFt^FSRIXa7rtAYOfrrRYSYALASVKLFtJLTYKlC£HF150AMNIL 

YQTVOAFPNLSGl-WMVVCEU-IRSGWUJSNMKYIEVCLEKlJ^LOKKTTro 

GIArLCLYLEEPNLFKDSRHRLISAMRTFPGKSALVHALGWOLCSALYFNEDSHFASAI 

SCF0SCLEWOLDATt»(KJKI^DAyFSWGIKKKSARIXWCAVDVASRirSLJ^ 

RGLALKCLA£AT IDEAYKEIFLSESUifYORAWDLSCRLEILELMCOSKYLLAELQOSLF 

HYD£AVTI^TKVDLTI^SSRVKLIIJ^AVUXa(GRI^DTPPA£EARET^■FPLVE^nrt^ 

NFLLlIjCKVYlJ'LmNXNVa^KlJUaTtEKATSLCCPE^ 

MVIRSAOYGVRITEAKWLNDPYIJ^^aJlErKAFREVVINQKCRLWU:^^ 

CPn_0288 325785 324571 

CT288 hypothetical protein

ISITIREFLFFCFECTl^KFYWI^lSCF^rt<TSTNESIJlPISPKAS^PK0CWOSYFRSAL^ 

HRSC3r^LSVSVCKV^nCYr^A^^^VRLTVIALAVVGVLI LFS IMLAS rOCTLVrrSWPLVTAA 

ILI PTI LLTOGHYILHRLCKXVDVISCVC IPPFSRRCWVP ISSSffTLEKFDEKHVSACSY 

L0rSTLSArX3SGrAAVYCX:PPLLFR\FPCFGIPCAMPIVALUWrYNLIRrLVVPFYIIF 

RMrYEHFFCKKLPEDDRFIYKDVAR£M»SlJy^FlJCAPFYASACMICU^SUJ)PI^ 

U^CSVERIMIDWIIARSVSLANEAHStXRFEOGCKRKGLCWHAFYL^ 

KCEIVSGAHPSIOLPEWlCLm'SGRYPHISVIPDSGNDSAKNrrV 

CPn_0289 325797 326996 

CT289 hypothetical protein 

NFNRUOKORSHYKKNNLLLLLSILVGlXILCSVQSPWrVVSAECIW 

VFCALGSTITSIONFTmfm/SKRILYYTLLTTVlAASIGIXIJTUAPOMITOI^^ 

TKCNPLCYU3VLStm.PENIFKPFL0CaJVISAACIAVLLC^ 

FFSIFIJ4U«lOGUCII.PIAha/3rSVILFKELKTOSfa™rAEYU/^ 

rUJCINKVSPUCVAKAMSPALVTATFSKSSAATLPLTMEIAEDDUCimQJLSRrSFPLCS 

VlWecCAAFILITVLFVATSNGMriSPI>iSU;wiFIATIJ^ICNACVPK^^ 

TSMNVPLSILCLILPFYTVIDMrETSDJVWStXTCWSLM* 

CPn_0290 327027 328523 

Na-dependent Transporter

RSALTMNKKHASFSSRLGFIFSMICrAVGACNIWRFPRVAAONCXXIAFLILWtCFLFLW 
I PL! 1 1 EI^IGKLTKKAPIiSAL IKTAGKKFAWAGGFITLVrrCI tAYYSTIVtWGl^^ 
YAVSGKIHr/»JDFAKl>iTSHYOSSrPLWAHLTSLGLAYLVIRKGIVHGIEKCNKILIPAF 
FLCTIAmJlAVTLPGAVCX:iKOU='SCDKSCrSrmCWIEALTONAWim;AC^ 
GFASKKTGWSNGALTAICNNLVSLI>G 1 1 r FSTCASOJILGTrOUJDGACASSIGITFI 
YLPEIJTRLPGG lYLTTI^SS I FFIJVFSMAAI^SMI SMLFIXSOTLAETCIKPYISETLA 
TI:APVLGIPSAr^LTFFS^^ODTVWCVALrV^raLIFIYAALVYGFPKlJCKEVINAAPCDL 
RUWAFDYI IKYLiP I ECI UXGWYFYECLFPEr«X>WNPISLYSU;SLVLOWSL^^ 
WKFNXOLYLRFSRYNHEIL 

CPn_0291 328658 329194

incB-Inclusion Membrane Protein B 

EKHMSAPI PTPOELSDOITCLWVOYOOVS ELARENKGDI ECLKTLTAALTADACIOPSAD 
EIYSLOTAAALILSASEKPGSGPSGSTECSVTVQSPCKFKfCVLAVVLTIIALIAIAVLIA 
CI lAACCGFPLLLSALNLYTICACVSLPI I ASTSVALICLCTFVANSLI KPVITVRTTR 

CPn_0292 329201 329836

incC-Inclusion Membrane Protein C

VK>mCISDFMTSPIPFOSSGDASFUVE0PQOLPSTSESOL\n'0LLTMMKHTOALSETVLO 
OORDRLPTAS I ILOVCGAPTCGAGAPFOPGPADDHHHP I PPPWPAOIETEITTIRSELO 
LMRSTU:OSTKGARTCVa.VVTAILXriSLtJ^I 1 1 r ILAVLGFTGVLPOVALLMOGETNLI 
WAMVSG3I rCFIAUICTLCLILTNKNTPLPAS 

CPn_0293 329940 332723

CT234 hypothetical protein 

VWSHORVLRLLFNLHHGEEKRAFLFFLLCLVWCrCCYGTLSLAECLFrEKLGSAELPKIY 
CC33LILCVLSSI.I LYNLFKKH I SATALFLIPVSLS I LCNFYLI LSS I FA r DPPRSPLFF 
YftlV tWSLTILSVTSFVCFVtXJFFKLODCXRHFCIFNAI I FLCDAIGSG 1 1 ASLVHTIG I 
QGaiLFTAALVLTFPIVFYVSKSLKSLSDDHDLFIDTCHPPPLSKALKLCr^DKYTFYL 
UrrrFLMOLLA I ATEFNYLK IFEIOFASKEEFELVAH ICKCSLWISLCNMCFALFAYSR I 
VKRU;VNNraFAPU:FU;LFLFWrPKTTl^IAVUWWRBCVTrALDD^WLOU.IYCVP 
NKrRHfj[RtWESFtErt(:MLVW3LrCFL^*G00YVFCLirSLIATILVCLVR.TnrAKArL 
KNLT^AWLTRSMOCWCK.'IWrVKOKROVELFLL^HLKftPSERHCTFAFOHl-LNLASRSV 
LP::t.LAIfMNKL3LFNKLKT r EMVKSSLWAKDFLTLELLK RWTi* f F PH PA r ACA IHLYFAE 
IIOLUt [TIf r AEOLVCm/CDRLLAA I LTVRROEArCPYnDLADKRLKELLNCDOPEDI VKC 
LT I LKLEKNPONFP U-LDFLm-KNEDl L IVrCKAUITriVRANIIKTYCrELLKPLftOCSHN 
r)EA:;OYl.LKT I t ALU I : :f VKDLL^f^T:H;UC^^•JRK Y AtTVM U:KLDK EVAPAFLOVLTDE 
' rrHNRCR [LAAKALCK [UNWLLKKItAYK rVK;;KAJKALFYnyiU-;riY lOKKYPTWUXLA 
^f^[/J^7JYYA^:VNFMU:LIx; l UISHEHSCVLtRALTCKNOK rKAOALESLEWICDSHLFSL 
LEl-r// IOi<:Mf;YSEKYYFKC(:VtlXTLKELLNMMENf;p::t:LNKl.TAOOLKEEL3YCOPOF 

{jr.vrrr i ynoki iEDKR'rEE.';ETL icvui t 

CVttJlAiA (1J077 iasi)2 



cAMP-Dependent Protein Kinase Regulatory Subunit

tRNFFwiLt DRAFLLKKT : r>vSLCMDU.LT : ADfcTET : : rKrncK\.T:; iGOrcFsnrr : 

VECY IT I EKLESPLNUCS)UX:F^£K:U*«TJ^WPAE*aiAU«VC^^^ 
ECPSVAL3FL£LVAKCilKFttE»'- '' I * f ii 

CPn_0295 33386ft ni-iZ? 

acpP-Acyl Carrier Protein

AMStXDDVrA: r/ECLGVDPKEVNENSSF lEDLNAOCLCLTELIKTLEEKFAFEISEEDA 



CT296 hypothetical protein 

KIPIRGMICKDITLVGKKVrVTOCSRCICaSIVKLFUICADVEIWCU^^ 
TGUX3E^^FARVDVSH^OCfVKIX:V0KFU^KHNK IDI LVW>W^ 

ISTOLTSLYYTCSSVIRHMI KARSGS I INVAS I VAX ICS AGOTNYXAAKASI lATTKSLA 
KEVAARNI RVNCLAPGF I ETCMTSVUJDNLKAEHLKS I PLGRACTPEDVARVALFtASQL 
SSYKTAOTLWDSCLTY 

CPn_0297 335724 334774

fabD-Malonyl Acyl Carrier Transacylase

SRSNKOOrffKXKRYAFLFPCCKSQYVGMGOOLYMEYPEVRElFDFANERLCFSLTSIMFX 

GPEDUJCnrVHSOtAIYLHSMAWKVLSORSSIQPSLVSGI^LCEVTALVASORISVLCX; 

LELVRKRGOLMNEACNOSPGAHAALLCLPSEVtEENITSLCOGIWrANYHAPKOLWACI 

AEKVDOAIELFRDLGCKJCAVRLKVSGAnrrPLMC^^AODClAPOIYALCMKDSSLPLVSHV 

VGKSLVNTEEMR£CLAi«3m*SPTLWYQSCYHIESEVDEFL£LCPCKVlJ^^^ 

ITSLQTFAO I EKFLSEV 

CPn_0298 336742 335717 

fabH-Oxoacyl Carrier Protein Synthase III
YTSrFLYMWFSWKNKKAAIWATGSYLPEKVLSriADLEKMVOT 

GPQ£nrrSIJ1GAIAAEKAIA^lAGL^KD0IIX:IIFSTAAPDVIFPSSGALWAKLCIEOVPT 

FDC0AACTCYLYCLSVAKAYVESCTYNKVLLIAADKj;^Sr\n3YTDRimn/^^ 

IGESRPCSLEINRI^tJGAIXSKLGm^LPAGCSRCPASKCTLOSGXHriAMECKEVai^ 

\mKMrrAAKHSIALACIOEEDIWr^^OANERlIDAIAKRFEID£SRVrK^^ 

ASSVGIAU}a.VHTESIKl£OYLIXVAFOGGLSWGAVVIJCOV 

CPn_0299 336726 337415

recR-Recombination Protein

RKKLVYYSESLYSNIin>CPRPECKNKIHITKrRYPDYWKLIFFIJlKIJ<;iGrKrAEI^ 
FELISWDSEOUCILGNAFHNVASERSHCPLCFTLKESKEADCHFCREERDNOSLCIVASP 
KDVFFLERSKVFKGRYHVT^SU^PITGKHI ENERI^ I UCSR im^TPKEIILAlDATI^ 
GDATALFLXOELQHFSVNISRIALCLPIGI^FOYVDSGTLARAfSGRHSY 

CPn_0300 337768 340152 

yaeT-Omp85 Analog

GRUflKLIMRNKVILQISILALrOTPLTLFSTEKVKBCHVVVDSITIITBCSaSNra 

PKUCTRSGAI^SOLDFDEDLRIIAKEYDSVEPKVEFSEXSKTNXALHLIAKPS^ 

NQVVPCHKILKTLOIYRNDIJXREKFXJCGLODUITYYLK^ 

DVLIKHJDSPCCKIKOLTFSGI SRSEKSDIOEF IQfrKOHSTTTSWFTCACLyHPDIVBOO 

SLJaiTTYLKNNGYAISAIVNSKyDLOOKCNZLLYTffilDRCSRYTLCKVHIOGFE^^ 

EK0SaVGPNDLyCPDKlWtXy^HKIK0TYAXYCVirmiVt3VlJ'IPKATRPI^^ 

SPYKVCLIKITCmirrKSDWLHCTSt^PGtn'FNRLKLEDTEORU«TO^ 

SOLDPMGNAICYRDir^^EVKCmGmiGtFLGrSSLCSaraSIELSESNF^^ 

KCFRCLRCX;GEHLFLXANFG0KVTDYTLXWTKPHFL^rrPVaLCICL0X5I^^ 

QTTOGNVSTTYILNEHLKYCLFYRGSOTSUiEKRKFUjGPNIDSNKG^ 

VDSPRTPTTGIRCCWTFEVSCUXnTHrrKI^UJSSIYRKLTRKGIUCriaaAOrim 

NTTABGVPVSERFrtCGrrrVRGVXSFriGPKYSATEPOGCLSSLLlSEETOTLlllOPN 

ISAIVrUJSCFVGLOEYKISIJCDLRSSACFGLRFOV>eJNVP\mLGFGWP 

lOVSORFTFALOGMF 

CPn_0301 340163 340762

(OmpH-Like Outer Membrane Protein)

IKDLSKEI FWFRKCrWYPFSI PKLVOVIbaCKLLFSTFLLVLCSTSAAHAfaCYVNUCRC 
LEESDLCKKETEELEAMKOOFVICNAEKrEEELTSIYNKWDEDYMESUOSASEEtJ^^ 
EDl^EYNAYOSOYYOS INOS^^«R lOKLIOEVKIAAESVRSKEKLEAILOTEAVLAIXP 
GrOKTTEI lAILNESFKKON 

CPn_0302 340766 341866

lpxD-UDP Glucosamine N-Acyltransferase

SKnCEFSMSEAPVYTLKOLAEXXOVEVOGNIETPISCVEDISOAOPHHIAFLDNEICYSSF 
UCNTKAGAI ILSRSOAMOHAHLKKNFL ITNESPSLTFOKC lELFI EPVTSCFPCIHPTAV 
IHPTARIEKNVTIEPYWISOHAHIGSDTYIGACSVIGAHSVLCANCLIHPKWIRERVL 
MGNRVWOPGAVLOSCGFGYrTNAFGHMKPUCHLCYVIVCDD\^ICANrPrDRCRFKWrV 
IHBCTKIDNQVOVAHHVEIGKHSI rVAOAGIAGSTKIGEHVI lOCOTGITGHISIADHVI 
MIAQTCVTKS ITSPG lYGGAPARPYOETHRLI AK IRNLPKTEERLSKLEKC3VR0LSTPSL 
AEIP5EI 

CPn_0303 342992 341921 

CT303 hypothetical protein 

REOKGLHHMOVSRK INRHTOFYVDS I DCV IKNFDHKPSEDKSRDHEELEEKLLTITKRIV 
ASAOEFQNRKT03KNYYLKKTOWLPFKNEELEC3TKELFAMLTSMDKKIAOLFFYSPOCSS 
CWVEFTEVXCHLNDS IGLCG\^U:CCLFEOCX: EK\AnVNKKLDLPLLUrnV^ 
TYRNI SLLNC03MSELCKELCDVUC0HCVAFTL I FKEI VDI DLLNYVKL lOGUCRSGNtO 
AR lYDNDVPTLPSVSSSP I ALRYSUANTIRCLALHVDFSSLKF I SPS I tSNTEHTAKALN 
3GGECF r FSNLDEFNLCMK I VMOLLRTCK [SPE t LNKN IMK I LMI KRRVRSLY I 

CPn.0i04 .U1091 144 ISH 

pdhA/odpA-Pyruvate Dehydrogenase Alpha

DOKPLPKPLPYKKVMDnSAPYNt Ar;(yrrEK.'rrVER t U)LYGPA:-.CI KFLKOMVLIREFEA 
RCEEAYLEC LVt'^^FYI ir lYAl'OEAVATAA I AfTF ;LDPWVF.';.';YPC-| rAU\ X LLW I PLQEZAA 
ELLTKET'rALi^RtTGSMHMiT^PHFWy :FX ',l\/r-/yj[ PLAMIAAFT r KYOEOKNRVSLCFIC 

ir^vAi.' ryfi iFTLNFv:;i jioLi'LMt. 1 1 fjm Mtitv rrnt /jravakoi* i aej.wtsydirav 
•ivNf ii-fjf .KN::i.i*:rRKAYiivMvt/rF-':ivr.vw;t ji ;::hFR( -mhi tioi^m :miKF.a¥jciSKK 
ot' tvr ^KiA^L I Ri.FVUTREEPyH r rof/:ktavi ,RAF::NAKL.';r:Dr::vTTf .EprnrrA 

pdhB/odpB-Pyruvate Dehydrogenase Beta

KKECMt-Ki irri .K I HiCAt,nbV\ [riFEM;;i'Or'Nv • 1 1 /;Er> :r>YrK:AYKVTKf :u.nKw:pKRV 
I uAP i:;hJVAf i : r ( : [( :aau i ii.u r- [ [ ^:^■M: :wr ii-::iVAi j )y t ( ;:i ^UKMiiiKn i ikf.wp t 
VKiKirtr :AA*v.»v:x.vii:nH*vFj:i.YAN r n a . i n Ai':;f if •YnAK'':i,i .k; :a i mnnni vi.flen 
Ki,fYNr.K':f:vi*PEfr/i.vPc*:KAiiftvof7;MtirT i it7::i*mv:: rTKJ---\i*::i jvkkr*#:l:;i£I 
[DUrrrKr-LDCCTILSCVTlKTCRCrVIEECHYF* jEI r ALITEHVroSLDAPPLRVC
0KETPMPY;;K tLEOATLPNVNRtLDrriEKVMP 

CPn,03n(i )45UtJ 346431 

pdhC-Dihydrolipoamide Acetyltransferase

CKFVI 3LU(MPK LS PTMD/CT rVKV«KK3NIX>\^FCDVIVEI STOKAI LE>r^ 

E r LJIHECEK : V loTP T AVt^EANEPFNLEEU.PKTEPSNL£A3PKC33EEVSPATTPQA 

A3ATrrAVTFK PEPPL J3 PLVRKHVGTTVNIJPUUlOlAKEKNr DVS3 lOGSCPCKRI^ 

■^r : .:"•.*,;•; !■• * : .v -.i - :" ■ ■ ■. ■! ■'.■'^'^'.rntKFiit.r' ' ~ ^/ r.vNpr.'. «v. - ; : r'rp":r'P'X'/':' 
■; :.:,M.i.:-*;:t^.A..''::K:..'::r • -.-vpa' v»:-\fyr::T.- irinr ;!•?:: r/r^ryr/prrrr r;:.;r A 

VAXPa;iITPIIRCAIJRI^LC«I3A£lKSLALKARNO;5UJDTrfKCjCJrr/SNLCKrc:T 
ErrAXVNPPOAAIlJVVCS\^EOAI.VUX;EITICSTCNLTl.SVDHRVIDC-rPAAMFt!K^ 
KILEAPAVLU-N 

CPn_0307 348998 346515

glgP-Glycogen Phosphorylase

NTCIVEDFSSFDKNKVSVDSKKIUIUDRLYLSVVOSPESASPRDirrAVAKTVMEW^^ 
WUOXJNGYYKNDVKRVYYLSMETIXCRSUCSNliNt^ 

SDACLCaO3L£;RLAACYtJ)SMATlJV\rt>AYGYGIRY0YGIfTORrVNCV0EXAPDEWLR^ 

NPWEICRCEYLYPV^^^YGRVIHYTOSRCKOVADLVDTOEVIJ^MAYDrPIPCPfCNr^VN 

RLWQAQSPRCFEFSYFNHGWIOAIEDrALIENISRVLYPNDSITECCJELRLKOErFLVS 

ATrODIIRftVTKTH IClIJNUADKVWOI-NITniPAU; lAElWHILVDREELPWDK^^ 

VIFNYT^^r^ILPEALERWPLDU•SKLLPRHt£^IYEINSRWtJ^CVCSRYPKNDDK^ 

•IVEBCWJKRINKANUlVVGSAJCWSVSSrHSQLIKimJTCEFYErrP 

awiAirNPRLSKLLNrriGORYIIDLSHLSLIRSFAEDSOTimWKCT/KU^^ 

YNEMIErVDPNStJTCHIWIHEYKROLKNIUlVrYVYNDLKENPNODVV^^ 

APGYVKAKL IIKI-INSVADWNODSRVriDKLKVlJXPNYRVSMAEH I IPGTDLSEOISTA 

CMEASCTGNMKFAI^ALTICTTMrxy^IEMAEHICKEWFrFClXEEOrVQU^^ 

rCDWfPKIROVLDlXEOGFFNSKDKDlJTCPIVHRUJiEGDPrFVLADE^ 

LFKEPDSWTKIS IVNTAG«GFrSSDRAIQDYAW)IWKVprKSCSCEGN 

CPn_0308 349213 349596 

No robust homolog present in Genebank/EMBL as of 11/7/98
FTTOENNKAWAOTPOTTOPOPSVSHKATHRYCSWVFFKPILVSlJSUiASLT^ 
5GVTI^LCXCIVLAIOIVIJ^IALVIJ^nWIR0FKQARTA£UtSKKMI5M 
LEDRYSSK 

CPn_0309 350977 349595

CT309 hypothetical protein 

FMRAWEEFLUOEKEICrrrnVDKWLRSrJO^rnACNLYI^^ 
SGLVNNNNKPIRVKVTSVDKAAPrnCEKQMQOEKTAYTTKHYCSVNPEI^ 
DLPFRVLQEFTICSPDENO G V'i l fff mJCPECSGKTHIJlQSAISVLRESQGKILYVSSDL 
FTEHLVSAIRSGEMQKFTtSgTRKIDALFIEDIEVrSCKSATQ m V KYt WSLHSESKLIV 
VSSSYAPVDLVAVEDRLISRFEWCVAIPIHPLVOECUlSFlifflQVniLSIRIOETALDrL 
lYAI^SlAnCTIiiiALNU-AKRVKYKKLSHOLLYEDDVK^^ 

NVAOYYCVSOES ILGRSQSREYVLPRQVAMYrCROICLSLSYVRIGDVrSPCHSTVI SS IR 
LIEQKIEENSHDIHMAIQDISKNLNSUOCSLEFFPSEEHII 

CPn_0310 353472 351049

60IM-60kDa Inner Membrane Protein

YFDLLSLIFRVYQWnCRTtXTTSLIGIArVGCQrFFCJYNEFIlSCKNIA^ 
AVESVCLSVASWOTVfCEEHKM^YAVRVCDKmXHNSEAAOSW 
FDMIHIALYRQ0CSSFNPTNTCKVrLPT7WiX;LPVLVVEFRNNKEPLVFLGEYAQGRrSN 
KDSTIFCTALVFWRSGSDYIPLCLYDSREEKLVSLDLPITRAVIFGNDODSAKSSCTANH 
YVU^YMOI IVSEESCSI EGINlJFASTNNKSIVNEIGrDRDIJ^EKSPEALFPCLSSK 
LPreOQAXNSICGYYPLIJlRGLI^SKKr^PLEYKAI^rn^SGRDJVTPVAUlYRVL^ 
HS lOLESLDRSVOKVYKLPENPEEKPYVFETAITLTKETEDVWVTSCVPEVEIMSN^ 
TIKYRVIKKNKGSUSKVKLPKVKEPIAIRRGVYPQWrUtSrKrm;: ILTPl^EIASCVCS 
LYISCSTAPTRLSMSPKNOLYPVSKYPGYETU-PLPKDACTHRFLWACPtAEPTUCVL 
DKTITOEICEWEYLDSISFRCVrArmPFAAIXFirMKFFKLVTCSWGISIILLTVrL 
KLLLYPLKAWS IRSMRRMQ ILS ?YIOOICMKY^C^rePKRAOMEIK:LYlCTT^CVNPITCCLP 
U.10LPFLIAMFDIXKSSFIJJIGASFI PCWrDNLTAPEVLFSWOTSIWr rCNEFHLLPIL 
LCIVMFt^KVTSUOCKGPVTDC»KQ0OVhCaiMKAILFTAMFYTirPSGl^ 
WOCWITNKI U>SKHLKNEVVUINKKHR 

CPn_0311 354453 353575

CT311 hypothetical protein 

r»lRAQ4AVIYVroRSKIVWSFEPWSLRLTWYGVFrrVGIFIACLSARYtJa.S^ 

FSKSOLRVALENFFIYSILFIVPGARlAYVirYGMSFYLQHPEEIIOIWHGCLSSHGGVI. 

GFU>rAAIFSWrYKKKISKLTFIJT.TOircSVFCIAAFFIRLCKFW^EIVCTPTStJW 

\A^SDPHOCVOCVPWPVOLYECISYLWSCILYrLSYKRYLHLCKCrrVTSIACISVAFI 

RFFAEYVKSHOCKVLAEDCLLTICOILSIPLFLrcVALLIICSLKARRHRSHI 

CPn_0312 354518 354976

CTlOl hypothetical protein 

CTMARNIKYFLI LFPC I LWI SAG^«UXJCATA lALDPLSSFFTYCLLSMVSWCLASLKHR 
■fLLSJCTIRKOLSLSSEFFSOKITWIAYIKQTFISRRFLIMVIMXAFSLVLRRYISNPOAL 
PV IRATVGYALI KTA r AYFSKLONALWENPEGN 

CPn_0313 354957 355355

acpS-Acyl-carrier Protein Synthase

WK ILKEI 3ANSME I IH rCTDII EI SR IREAIATHCNRLLNR I FTEAEOKYCLEiaOPI PS 
FAGRFACKEAVAKALCTC rCSWAWKOr EVFKVSHG PEVLLPSHVY AX IC ISKVI LS I SH 
CKEYATATAIALA 

CPn_0314 356285 355353

trxB-Thioredoxin Reductase

M IHSRL 1 1 IGGGPSCYTAA I Y ASRALLHPLLFBCFF3G 1 3C5COUfnT£VEKFPCFPBC I 
LCPKLMNhWKEOAVRPCTKTLAOD 1 1 SVDFSVRPF r LKSKEETYSCDAC r lATCASAKRL 
E I rCAflNOEFWOKGVTACAVCDCASP I FKNKDLYV [''/^GDSALEEAI.VLTRYOSHVYWH 
RRDKLftA:;KAMEARA0NNEKITFLWN5E [VK ISC;D.': IVRHVD IKtWTOE ITTREAACVF 
KA la mtTTDFltXXiLTLDtSCY IVTEKfrrSKTSVF^FAACDVODKVYRQAyTnACSGC 
tAAU3Arj*FU: 

'I'lJMl'. lS(;n7 (SKTli. 

;:i l(it<'r:;iMiv.a Prot».'in 
Mi'KOArrrwt :kk r(.DN i eclteovaefko[x\ta*ip iTr::;EEE::PNE lOF-'yv tucoTw 
I • I HKt>r- vwi tvf ;lk: :ei w t iw,:fs i Dn^Ef :Lvu:AE-yr:vY i-doakoef* iKV/LnnEKATR 
'jf«>*KY 1 1 M IC -i-w :: : i VK( ;y itrkvko :i. I vd i» >«fjvflpc ino t i;nkk i knludyvckvc 

KKK ri.K tWWtPN IW::RHELLEAFJI irKKAELtEO I" fflE-^RKiIVVKN ITOF^VFLOLO 

1 : 1 nf;u J( iTi mtwkm ( t<np.:RMVEi/K?ELFVi i L;;7riKEKf;RVAUu j<okfjimweo I ek 



KVPrCKRVLnKlVKEXPYC/ t EECIEC:.IH E JEMSWVXN rVDP^r-AfNKCOEVEAlV 
Lr; rOKDECK :3LGLK0TEWiP»^lE£KYP ECLHVNAEI KNLTWGAFVELEPGIBCLIH 
: .':0M5Wt KKV-IKKJELFICKGNSVE^WS UIVDK ESKK tHjCVKCLSGMPWNEI CMffMCT 
VrSCVVTKTTAFCAFTO^ftCLECn;iHVSELSDKPFAlCI€Dl rSICtJWVSMWttlkLtoOH 
KKVSLSVKEYL-yWiAYDODSRTELDFKDGQCPKERKKKCK 

CPn_0316 358784 360121

nusA-N Utilization Protein A

rri;-Kr :r' :"•/•■•• TK- :•/•■:: '.Mr-r::;-- / >-.\a£Yt:i vt-:nYMDvrr/f:nNFCR:AAH 

AARC : X JOKUlHAErtLV I YUiV HI iRVNtTLJCWKKKAKsy NL I lOLCJC/EAILPTRFV? 

KTEXHKICOKIYAU,YEVOE:SENCCAEV'rL3RSHAEFVKOLriOEVPEL£ECSVEIVKIA 

REACYRTKLAVRSSDPKrOPVCAFVCMRCSRVKNI IRELNDEKIDIVNYSPVSTELLONL 

LYPIE10KIAIL^^0^CVIAIVV^roADYATVIC>a^CI^WRLISHILDYELEV0RKSCWKL 

L£IORWIA£FT)SPHIXOPL£MECrSKLVIONIJaiACOTriRRVUASANDL^^ 

EIAYKILEOVSKYCESJCVDEKPEIED 

CPn_0317 360045 362750

infB-Initiation Factor-2

SLLIRSLSKSANMEKVltt.TJCMJCLKrKNA0l*TICttCU5KIJC0K^ 

AKEICSVKVALAATSTPTASASOASPESTSRRIRAKNRSSFSSSEEESSAHrPVDrSEPAP 

VSIADPEPEI.EWDEVCDESPEVHPVAEVLPEOPVLPETPPOEKELZPKPVKPAEPICSW 

MIKSKFCFTCKHINHIiAKTFTCAPAKEEKWACSKSTKPVASDKTCKPGTSEOCBO^ 

KOFNPANRSPASCPKRIJAGKKNLTOFRDRSKlCSOESUCAfTGRDRYCLNBOGEED^^ 

RVYKPKKHYOEASI0RPTHIKISLPITVKDIJUl£>lKLKASEVI0KLFIHG>frYW 

SETAVOFIGI£rCCTIDIDYSEODKU:LSNrm^EIOSTDPSKLVIRSPIVAFMCHVDH 

CKTTLIDSIJlKSWAATEACArTOHhfGAFCCSTPVCDITrLOTPGHEArSAMRAIWAE^ 

DIVVLWACDECIKEQTLEArEHAKAADIAIWAINKCPKPWFWgPTTVTiQ^^ 

AWaCSTVTWrSAKTXJBCLSEliEMLALOAEVLELlCADPSARAi^ 

TVL10^CSLKLCEALV^NDCYCKVrTMHNEHN^i^KEAG PS I PVLlTCLSOl PKACaJPFF 

WKNEKTARDIIEARSACCWRFAUXIKKRPNFOSMLONmXiailllCADW 

ISKIKSEKVDVEILTOSVGEISESDIRIJU^KAVLIGFHTGIESHAEPUKSlfiVR^ 

FTVIYHMDMKEIHrStlXPIAEEKDEXSSAEIKErFRSSQVGSlYGCIVTBCimTOIHK 

KL 

CPn_0318 362704 363126

rbfA-Ribosome Binding Factor A

VMSYNVMKr^IIHWTfNLKYCWrQiRRIKRVNAUOEAIAKVILKDVK^ 

RVSLSKDLHSARVYVSVHPHENTKEEALEAXJCVSACFIAHRASKNVVtJCYFPEIJI^^ 

IFSP0DYIENI.LMaiQEK£K5 

CPn_0319 363133 363879

truB-tRNA Pseudouridine Synthase

TIFFXSNLm'rKDMIMDtAVEIJCBCILLVDKPOCRrSFSLlRALTXLIGVJa 

FATGVMVMLIGJUCFTRLSDIU.FEDKEYEAIAHlJC;iTrDSYDCIX:iC^ 

LSAAEYFOGEIQOLPPMFSAKKVOCKKLYEYARKCLSIERHHSTVOV™ 

FWSCSKCTYIRSIAHELGTMLCCGAYIXQLRRIJlSGRrsrDEClDCattXDHPDrDISW 

LRDAKGHSL 

CPn_0320 363824 364783

ribF-FAD Synthase 
TrPISIFpTYQflWtAYSI^SFSVDSVT^ 

FDSHPOTVLSLNKTKLZtrrKEERLOLLQTFP Z tMlGWT FDLNFANOSAEEFLTLLHRNL 
KCKALItCYDSCIGKEOOSKTEALCT IGKPLG lEVIKI PPYRMDNIWSSKAXROrLSAG 
NLECAHRFIjCHPYAISCKITBCSGIGGSLGFATINLPREESLIPLCVYACEXRYDSTPCQ 
CVMNlXrrAPTFGRESLYAEAHIFSFAENLYGKEVSIIPRmjlEEKKFOSKCTURAI 
OILDAQCWFAKGSFTfYESTA 

CPn_0321 365900 364767

ychF-GTP Binding Protein

YSWCHVIIFIFRCI/lSKTCCCIVCLPhMSKSGLiTlALTCAOVASCWYPFCTrOP^ 

VIDERLEAIJUCISNSOKIIYAI»Ga^IAGLVXGASIXy^LGNRri^rRErHAIAHVVR 

CFDDPIJVmwSGICVNPVEDIEVTNLELIFSDrSSAKNIHSKLEiaAKGKJlEV^^^ 

TIIAHLEKCLPUm^LTPBOWAIJCPYPFLTMKPMr^IANVOESSLPDKDMDYVAAVRE 

VAAKENSKVWICVRIEEErVSLPIEERLEFLKSLCLEKSCI^RLVWUVYDTlJCUSYrT 

TCPOESRAVmrVRGSSAWEAACEIHTOIOKGFIRAEVITFEDMIBCOGRAAARELCKLHI 

BCRDYIVODCOTMLrLHN 

CPn_0322 366231 367328 

yscU-YopS Translocation Protein U 

SNijCNSMCEKTEKATPKRLRnARXKCOVAKSODFPSAVTFIVSMrrAFSLSTFFTKHLOG 
FLVSMI^OAPTRHDPVITLFYLKNCLKLILTASLPLLCAVAVVCVIVCFLIVCPTrSTEV 
FKPDIKKFNP lENIKOKFKI IcrLI£I.IKS lUCI FCAALI LYITUCSICVSLI I ETAGVSPI 
ITA0IFKEIFYKA^rPSICrFFLIVAIU)C^nrORHNFAKELKMEKFEVK0EFKDTBCI^PEI 
KCRRROIAOEIAYEDSSS(^WiASTVVSNPKDIAVAIGYMPEKYKAPWXIAMCINUlWCR 
I U>EAEKYCI P IMRNVPLAHOLUJEGKEtJCFI PESTYEA IGEIU.YITSLNAONPMOWr 
NOPOHL 

CPn_0323 367322 369460 

lcrD-Low Calcium Response D
• SFTMNKU/tFVSRTtjCCDTAIJM I^n<3SDLILA^>J^^MCVVLM 1 1 1 PLPPPrVDt^^ 
SISVFLLMVALYIPSALOLSVFPSLLLITTMFRLCINISSSROItXKAYACHVIOAPCDF 
VAro©firVVCFrrFLIITIIOFIV\nXCAERVAEVAARFP.LDAMPGK<»<AIDAIMJ^^ 
AT0ARDKRA0I0KESELYCAKDOAMKFIKCDVIACIVI3LINIVOGLTIGVAMHCM0LA0 
AAHVYTLL3 rCTCLVSOIPSLLIALTACrVTTRVSSDKNrNLCKEISTOLVKEPRAUXA 
CAATLCVCFFKGFPLWSFS I LAI, I FVAU: ItXLTKKSAAGKKOOGSCASTTVCAACOGAA 
TVCONPDDYSLTLPV'ILELCKOLSKLIOMKTKSCOSFVDDMIPKMROALYOOICIRrPCI 
HVRTOSPSLBCYDYMlLt^EVPYVRGKIPPHHVLTNEVEONLSRYNLPFrTYKPlAAGLPS 
AWVnEDAKAILEKAAZKYWTPC.r/riLHLrr/FFllKnSOEFLCIOEVRSMIEFMERSFPDL 
VKEVTRL t P WKLTE I FKRLVOE?D tS I KDlin- r LEJL.':EHA0TEKOTVLLTrrVRSSUCL 
Y t :;FKF30C0f:AX S\A*LLDrE I EEM t RCA t KOTIJAr I.S"^ LALOPa'^VNL ILKSMRIfflTPT 
PAt y KJf-PVLLTA I DVRRYVRKLI CTEFFD [ AV r ::YOr. I l.PE r R lOPLCR rOIF 

iTJ24 h/vxjthor IlmI protein 

VWAItftRMHAASGOTXX:L0triVr/tILAAVI-:A/JW\KADAAE'/VAr:0Er:::EM^ 

Ni*AAATirrKKKEEKFcaE:;RKKGEAf;KAi:KK:;Kjrri*EKr'r/rDrAOKYA:rj^t:;ElJ^^ 
itt;u'UA 1 1 :uaA.':pEOi ui.vuek i kdumjj: .tai j)YLvyrrpr::o( ;klk e*m- xoarktht 
a•^^;»<TAIt;AKNIu^va:EYAIX>u^/r:[^•v:l>;:f.Yl.Er^;m•lm:DQLL:^^LOonYTYOo 
ma rvn::FiJ<KCMATELKitoi^PT/r-::Af^i vvi /f rt:ruMf/^AVLT::YDYKG;nvp t llocuc 
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AF^ tyrril'Jt JJFVKVAE.T/'HK : rrJDKrPT/VIKV. AHL £C0DVD3VTT/1-N:.FF3ALR 
0TC3RLFr:CAI)KRQ0I>^ I ANALDA'/N IWEOYPKASOFPKPYI^a 

hypothetical protein

KRIAH0WYK:.Lc:5LAPLLWTTL^POKN^^SCL:RFSI7rHVPVX5IEEDCNK^ 
CTTLPENVFRER : FK/\ALJVNC.':FQSS r KC : U;'«;EVT00LYLSD : L3M>« 
KLF.-JUJAK rWMEJLR-nNLPDEJr/U^ ITm 

malQ-Glucanotransferase

PSCFCNLLRRVNVUr/TKHSPSAHAWKLIGTSPKHG lYLPLFS IHTKNSCG IGETLDLI P 

LISWCQKQCrSVIQU^PUnTTCEOTSPYWSISSVAL^PLrLSLSSLPWIDTIPEV AKKLQ 

[^WELCSTPSVSYTOVKEKKWAFLRWOKCCKSSLEGNSNFSETLESERVWL 

AI KHHMHGEP INNWK3LTD0EIff'PDLTXXrHDEVLFFSYWnrYQGIXE\n<AyAW 

V^XKGOlJ'ILISKDSCDVVrtTKDYFS3SRS^rt;APPDLVNSBCX3W«LPIYNFSQLAK^ 

IWWKERUtYAQNTfSVVTUJJHt IGFFRLWIWDSSCRCRJ"IPDWPKDYIKOGTEIl-STO^ 

ASSMLPICEDLCI IPODVKTTLTHIJCICCTRIPRWERNWESOSAFI PUCI^^ 

THDSDTFAO^A^SPKEJUCQFAmJlU'FCICTr.rrETOIOIUa^HESASIFHrNIJNW 

LALCPOLVSKNLORERrKTPCTISKKlMSYRVRPSLEEtAIHWCfrKTflEKILTC^ 

CPn_0327 372927 373211

rl28-L28 Ribosomal Protein

RrHPJaMSRKCPLTCKRPRRGYSVTLRCIAKKKKCIGUC\m:KTKRRFFPNM^ 
EENRrLKLKISASALBHIDKLCLEiCVLERAlCSKNF 

CPn_0328 373220 374992

CT085 hypothetical protein

LKyREIFKSrLRR«ISLrRSQKOLXDVFAPVSPNLEL\ElHRRViroOGPAIXFHNVIGS 

SFPVLTNUT3TKHRVrX5U^SOAP[»n.rARVAHLISSTPKI^SI>OCSRDU^ 

ARFRRFPFVSMSSVfffJJHLPI^TSWPEaXATLnJ'LVVTESPTLTTPNLGMmC 

^r^MCLH^OIOKGC»^KLyEAfX^KKONLPVSV^LSG^^>FLTLSMAPLPEWSE 

OGAKIiYWCTTIDHPHPlJLVnAEriLVGESPACKWlPEXSPFGDHFOT 

lyHRKDAIYPATWGKPYOEDFYIGNKLOEyi^PLFPLVKPGVRRLKSVCESGF^^ 

WK£RYWRESLTTAUlILGEGOL5LTKrLMVrrOOE^^LDRrSVVtJTILERLOP^^ I 

FS ETANCTLOVTGPSLNKGSKC I FMGIGXAIRDLPHGYOGCKIHCVODIAPFCRGCLVLE 

TSLE33RCIKS1XHHPDLKSWPL I ItAIJ^UlCTIOSEKDFLWRTrrRC^ 

ATHRPtffNFPFVIDAlJiKPSYPK£VEVDPSTK0KVSE3U»mAYTPNKETrYI 

CPn_0329 375085 376146

Phospholipase D Superfamily [leader (33) peptide]

KMNKROKDKUCICVIISTLILVGrFARAPRCrn'FICrFlJCSEEAIIYSNOCNEai^ 

AIEKADEEIFUima^EPXIQOSLTROAOAranCVTmOKrKIPOILKOASNVTLVEC 

PAGRXLMKOKALSIDKiOaAWljGSANyTOLSUUXtWLIUMSSELCDLII^ 

KDQTCKYFVLPODRJCIAIOAVLEKIQTAOKTIOVAMFALTHSEI lOALHQAKQRCIHVDt 

IIDRSHSKLTFKOUlOLNINKDFVSim'APCTLHHKrAVlCNKrLrJUSSINWS 

DESLIILEia.TKQQWKUtMIWKDlJUaiSEHPTVDOEElCXIIEKSL^^ 

CPn_0330 376930 376202

CT083 hypothetical protein

FlSIEMIII^IJ-SVLPSRFOOLHVYRFXESUaU^FhfrMTOEI^^ 

RKLPVRKRREKMYIJlIFRVLSRFDVKRriRrDPYCALSAOSIAXDSRONSPL^ 

ATNEAIRIJOiAIGDRiOEEKKORHRYKIJjCOKOAKVU^OIJWW 

D0Eia>£XNKQKR5IKVTKK}QCGISU;AAAS0AIAAAAEAWVIAi^GVI^A5T£^ 

EEA 

CPn_0331 378452 376701

CT082 hypothetical protein

IQRI IMAVSGGGGVOPSSDPCKWNPALOGEOAEGPSPLKES I FSCTKQASSAAKOESLVR 

SGSTGMYATESO INKAKYRKAODRSSTS PKSKUCGTFSKKRASVOGFMSGFGSRASFVSA 

KRASDSGBCTSLLPTEMDVAUCKCNRISPQlOGFriJlASQOCSSSDISOLStXALKSSA 

FSGARSLSLSSSESSSVASFCSFQKAIEPMSEEXVNAWTVARLOCEMVSSLLDPNVrrSS 

LVRRAMATCNECMIDLSDUXJEEVSTA^frSPRAVEGKVKVSSSDSPEANPTCIPNS^7rLE 

PAEKEAEKQESREOl^EDOMMIJUlWlAGriTGAAPQEVI^SWSGPST^ 

PT0RSGDKSKHKSPGIEKST^mTNFSPUlEr7rVKSAEVKSLPHPESMYRFPKDSIVSREE 

PEAWKESTArKNPENSSONFLPIAVESVFPKESaTGGALGSDAVSSSYHFLAORCVSl.L 

APLPRATDDYKEKLEAHKGPGGPPDPLI YOYRNVAVEPP IVLRSPQ PFSGSSRLSVQGKP 

EAASVHDDGGCGNSCGFSGOORRGSSGQKASROEKKCKKLSTOI 

CPn_0332 378676 373536 

CHLTR T2 Protein

YIXSRI RVl PLARORCrMLLAVLC PPISFFTOGVSPCVFFCFLDF 

CPn_0333 379117 379800 

ltuB

VDFFVFVFFMGKPKKSRTDRALAOEIOKKSTEVLKKPARIKAKNRRKFLIAKEOKTLKHR 
A0EYD0L-/R3La)20KKDTDKVL r FNYE^JGFVFTDKDHFSKYS I RL 

CPn_0334 37!>30S 379823 

CT079 similarity

TMSVH ITPRKCF I LC I LSMFTLPTLFPKAHLILFSPYmr FYCFSKDKCLVLALCCCVL 
-DLALGSRGVFLLLYPLTALITHKAHt.IFSKESKAALVrVNMI FYCVFLLLT I PMCALFC 
HEVRWC I DVLMI PLKCSFLDNLI FTSVIY I LPCAINSC I HKM ISFFRRLVCY 

CPn_0335 379R0P 380674

folD-Methylene Tetrahydrofolate Dehydrogenase

EICMU.P'JI PAAEKILCRLKEEISOSPTSPCLAWLICNDPASEVYVCMKVKKATEIGr : 

.'JKAHKLPCDSTLSGVLKL f ERLNQDPS I HG I LVOLPLPKHLDSEV I LQA I SPOKDVOGLH 

PVfWCKLLLGNFDCLLFCTPAC I lELLNYYEI PLRCPHAAIVCRSN IVGKPLAALMMOKH 

I VTNCr/IVU UOSENLPE I LKTAD [ t lAALCAPLF t KEThWAPHAV I VDVCTTO 

AKf;\TLi/:DvnFriNv\TKt:.urTrvfKxnff:pKrvAMU«^^ 

'T-njUtt. ^Hn5^*» is 15^1 

K:UKMi':;(ni::::::wi/KW::in*:Rv,MUX>VMAMLPKFFL-/LU:LCU:::i;a'»KTT^ 
rYH[Vi/rr::t.:AKEK/\::i-:wiDRi;FHKto:;tYfn*ftiPY.^EL,':riNRAPADVPiTL:;vEL 

::k|.-| XX/Jl/n.yK L:;iX:RKnt TVCPr.KTLWLU l LK.'IOTLP PKOVWI^OHYK DMr>-W0HLEFO:I 
rn-KTt, [ KKNI IIV'^ { DU\:SA^''YAVDCLNE tL*WFr;pNNYVEW:x :e CKT:X:iiMP 

t k;:kw\/V rr iuuhh ma t at:?*m»i loKwvnriK tYTiaLC/rrm:Ki'!.ELH:;YPto:WSVVH 
*AYAt/A t ATVI Kn*r\:K [ FAK(.^AEaill clty rticoAnri 



CPn^OJi"? - I 3H1575 

smpB-Small Protein B

[EEIFPCNOCKRIL-I IUL»RI<NCFLJ;.Wr.L3PtKMiaJ<A0KEXWS5mKALai«CW 

LHRYEUUCI^K lAQKCafTL I PU><Fl^RCYVrVRUXr RCKKAYDKHRTI lEROCEI^ 
AAAMKRRHH 

CPn_0338 382272 383375

KNMKr.--.-f?Nr". - —"Xfrrrrr ■".::! v:.::ttvt:: !"A"T^TtHrvr:r- "vrK 

AK\,-YEKCAXJ I PJKRKFwLVKELTEANLt: U JJAJtHrV; liJJJ JCr KLU;MEK£DFFM;, 
PDrONALRFSLPAEOUCTMtCRTSFAVSREESRY\a.TG'AJAIANCVATIVCTDaCRl-^ 
IDAEVTLOKSFSGEY 1 1 P IKAVEEI I KMCSDECEAAI FU»DKrAVECOrrriJ.ITKII^G 
EFPOFSPVr STESNVKLOUIREELITLIJCOVALTOESSHSVKFSFLPCELTLTANCTICV 
CECacVSMAVNYSGEliEIAFNPFFFLOILKHSKDELVSLGISDSYNPGIITOSASCLrVt 
KPKRLKOD 

CPn_0339 383405 384034 

CT339 hypothetical protein 

vrriJW>ocicsuaja<FRNHSDixiSLAPKurfA(xaaTnxE^ 

OTrrrGSSHFFLETOraaJHLi^JAl^ irrOKCKKKICYNOLP IiaLSOLICKWIVLre 

KORIXISGAPADRRI^UJIXI^OCCtOm'I^LSYYHRALCX^RNAUJCSKgTm 

WSm'APTyPSNGFS^A/R^^FQIYPICNFGLTT 

CPn_0340 383842 384156

(frame-shift with 0339) 

PLYPLLIVLSSRSSAEKCSLICKOANLNRCLWDEOLVKHGTYLS lORfLCSQKLSDLSKEL 
W5N»aJC£0ZJUJCFKS5L2KN5DISETAVA£EFHKOLSISLPRDL£ 

CPn_0341 384160 384495

(frame-shift with 0340)

GSTSWPHREDFU.T^OWPVS0FSSBO0KHSLIAILRJLAECLYLK0SHHVSPLVCLDDI 
KACIIKERVCOliDPAPTLGQTLITSTHHHCELPJCTSLVLSIENAOVSEOI I 

CPn_0342 384619 385062

predicted OMP [leader (19) peptide]

HMKKFlXTIli^AVQIPtXS^rS^^C^^PSGIOCUCETSK0KESVVCVH^ 

IAKVIXKEKVDVFIWXYCTRKrri£KHAEHUJRIXiaCIA£^ 

VALAMPDCPEEAKKEKLFSWLLRTOCLK 

CPn_0343 384999 385595

(frame-shift with 0342?)

LPRRSQXRKAIlilAPPNACSTUUU^YRCVKFVOrmxnC^ 
SLZJVLILSGNRHSKFLPFRLPYENIXSKVCTIETmTrPHKAYVtHTSJnYir^^ 
MKEFlJC£r»rrTPIIEHVPEAAI37rVMEDK0KNSRUCinfIWDIYVIHCrC» 
PKXNSLNQKNEZNPEXLEK 

CPn_0344 387432 385558

yaeL-Metalloprotease 

SSRVKTIIYTItJUUJ^ILVLrHElXaaNA/AKAVtMAVESFSICFGPALFXKRIGC 

RZCX:IPFXX2YVRIKGKERTK£)CGEKGKXOSVYOIPCX:FFSK5PWKRZLVLVACPL^ 

AVIAFSILYMNGGRSKWYSDCSKVVCWVWPVLOABCIXPCDEILT^^ 

SU£Catt/0:iKRPCrfLTW»SKEFAII7\^rDPTKrG\m:SCASYIiYSNQW^ 

NSEIJ«t©RFVWKDGTLLJ'SMAOIS0ILNESYAFVrtCVARNDKrFFSR0PRVIJ^^ 

YlJtm.I0T0yEMXXCKW5SLYTLPYVINSYCYra;ELTAIDPESPLP0F0ERLQLC^ 

ILAXCXrrPVSCSVDIIJU.VONHRV'SIIVQOHSPQCLEEVmSRIMDKRFIASYHSCD^ 

LNKLGESKPVEVWPYTUXDPVQPRPWZDVYSSESLSXQLEVAKKIKNKOfCORyY^^ 

AEK0KPSLCISUa)UCVRYKPSPVVML5NtTKESLITLKALVTGHLSP0«t^ 

UrrGWSVCrSE\rtJT«CLISMNIJ^VU^LU>IPVLIX3CYIL^ 

LVPFTTLLI IFFIFLTFOOLFRFFG 

CPn_0345 388587 387436

CT345 hypothetical protein

UCVACIiCHLAVLCSTCSIGRaTl^rVRRYPSEFKI r SMASYCNNIJU^FFOT 

A\rrtIE3VYNEACORFPHM0FFLC0EGLT0U:iMmVTTVVAASSGI 

ALALANXEILVCACELVSKTAmCIKVLPIDSEKNALYOCLQGRTIEGIXKLILTASOG 

PLLNKSLEELSCVTKODVLNHP IW^MGSKV^VDSSTLVNKCLEtIEAYWLFCLEl^VEILA 

VIHPOSLIHGMVEFUXJSVISIMWPDMLFPIOYALTAPERFASPRDOlOFSKKaTLErF 

PVDEERFPSIRlJWOVtXKOGSSCSFFNAANEVLVRRFLCEEISVCDIUlKLTrL^^ 

VYACKSLEDILEVDGEARALAOEI 

CPn_0346 389690 388704 

070-troD/ytgD- Integral Membrane Protein 
KKGSLMALCPSPYYCVSFFOFFSVFFSRLFSGSLTPCSLYIDDIOIIVFLAISCSCAFAC 
TFLVWKMAMYANAVSHTtT.FGLVCVCLFTHOLTrLSLGTLTlJUVMATAMLTCFL^ 
NTFICVSEESSTALVFSLLFSl^LVtXVrtVrKNAHICn'ELVLGNADSLTKEDIFPVriVXL 
ANAVITI FAFRSLVCSSFDSVFASSLG I P I RLVDYLI IFOLSACLVGAFKAVCVLKALAF 
LI IPSLIAKVIAKSIRSLMAWSLVFS IGTAFLAPASSRAILSAYOtCLSTSCISWrLTM 
M-^ tWKF ISYFRCYFSKNFEK ISEKSSQY 

CPn_0347 391078 389578

069-troC/ytgC-Integral Membrane Protein
TFCTOPEALSRKTIWIVLIMI^CVFSDTI FL3SFLAVTLICMTTALWGT r LLISKQPLLS 
ESLSHAS-/PCLLVCALMAO^VFSL0A5 1 FWIVLFCCAASVFCYG I IVFLCKVCKLHKDSA 
LCFVLVVFFAICVILASYXTCESSPTLYNR INAYLYGOAATLCFLEATLAAIVFCASLFAL 
WWWYROr/VTTFDKDFAVTCCLKTVLYEALSL I F ISLVIVSCVRSVC IVLISAMPVAPSL 
CAROLSDRLSTI LILSAFFGG ISGALCSY X SVAFTCRA I rCQOAVPVTLPTCPLWICAG 
LU«:triiF3PICiX>WIRr^TlRKHFCFSKD0£HLLKVFWHISHNRLENESVRDFVCSYKY 
Or/PCPKPFPRWR\«5It^>fRCWXKEODY-niLTKKCR.';EALRt.VRAHRLWESYLVNSU)F 
aKEGVHELAEEX Ein/LTEELDllTLTE t LNOPC^DPIIPO II PNKKKEV 

CPn_034S( J-i 1^(15 I01OZ7 

068-troB/ytgB-ABC Transporter ATPase
FrwLU^VKDETFW.WHNu:\WEHAAV|.Yl^I:;Fr:LCKr;cLTAt^/:PN^^K^T 

LI KPCSfTT/'^FFNOKFKKVROP CAWPORA.'r/rjWnFPtfn/I.DLAIJf X.-YUYKt-WWCR IZS 
DDftREAFniLERVCLELVArROf'»L;;'y:0QOPAKLAI<ALMOKAOLyU4DELF:;AtDMAb- 
FrrSVfr/LOEU* CCGCT I\-\'VHHDL::nVROt.FDHWLLNKRL tCCGITOEX.IlJUDT I FOT 

rj:E I ELLEOTLKLSRGKor :c'; 

*;f'n_o»4'/ ^iV!??/^ i-t|7'»o 

067-troA/ytgA-Solute Protein Binding Family
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WtLKH/VIREMDAKXrriFKVMRWrFTFVACGITl .JSCFONANSRPCILSHNRMIHDC 
VERVVTNHL/VrAVL rKC3LDPHAYE><\ntGDKDK r ACSAV r FChCLCLEHTLSUlKHLENN 
PNUVKU:EBL [ ARCy^FVPLEOX; ICDPH I WKDL5 1 WKEAVr E ITEVLI EKFPEWSAEFKA 
^^"EELVCEMSrLD:WAKOCt^IPENLRYtVr;GHNAFSYFTRRYLATPEEVASGAWRSRC 
tG PECL3PEA0 [ SVP D IMAWOY I MEHDVSWFPErrTLNOQAUCK I VSSUCKSHLVRLAO 
KPLYGDNVTIDN-tT.TrFKHNVCLITEELCCVALECOR 

\- ".it v'-.-z. • t.y^ • vr.-/. - KfinLC^v: .r- V :r:vMK)n t:.kkmk.:;v\; 

tCLLKMUCTFILEScTPftTINPKPhAPRGSKKWUJfrNFTKTDIERVL^^ 
AHFS PKKPLTSLKRELIRS IRNC IVSVELWNAYVEAVKAVSS PNLEVTSPFV 

CPn_0351 393861 395432

adt-ADP/ATP Translocase

KIKVrORVNmTaEEKPFGKlJlSFtJWPIHTHELKKSrtJ>MFl^FCIT^^ 

tVCAPGSGAEAI PFIKFVLV^TO^I I FMLIYAKI^lLSKOAUnf AVCTPFLI FTALFPT 

yiYPIJlDVLHPTEFADRLOAILPPCLLCLVArUWrFAAFYVI.A£IJWSWLSU^^ 

ANEITKIHEAKRFYALFCIGANISLLASCRAIWASKUIASVSECVDPWCISUUXMAMT 

IVSCLVLMASVWWINKNVLTOPRFYNPEEMOKGKKCyUCPWCIMKDSrLYLA^ 

LL\^AYGICINLIEVTWKSOUCI^PN>^^roySEF^CN^S^WTGWSVLI 

FCWLTCALVTPVMVLLTGrV^FALVIFR^roASGLVAMFCTTPLMLAVVVCAIONILSKST 

KYALFDSTKEIWYI PLDQEOKVKGKAAI DWAARFGKSGGALIOOCU-VICCSIGAKrPY 

UAVILLFIIAIWLVSATKLNKLFLAQSALKEOEVAOEOSAPASS 

CPn_0352 395478 396830

No robust homolog present in Genebank/EMBL as of 11/7/98
WVCIFFINSHFTNSYAFFNOKVIITVRHSCCTMKCSPLTLVPHIFU<NtX:EEHRSCSUtI 
RT IARLILGLVIJa.VSAl^ri^LAAPISYAIGCTIJVIJU^IVILI ITLWAlI.AKSKVt^ 
PNELOK I mmYPKEVFYTTKTTiSLTVNELKIF I^O^CSGTDLPP^^J{KKA£AFG IDILK 
S IOLTLFPEFEEIlXO^rcPLYWt^HFIDKTESVAGEIGU^lCTOICVYGLLGPLA^^KCYTT 
I FHSYTRPIiTLI SESOYKFLYSKASKNOWDSPSVXJOCEEI FXELPHNMIFRKDVW 
OFLFIJFSHGrTWEOAOMIOLINPDNWKMLCOFDKAGGHCSKATFroFLNTE^^ 
SWEPTVWFfmWELKVIXEKVKESPMHPASALVOKICVrrnTWOr^ 
wrSSLPOYAFHAOTYKLEKKl ESSLPIRSSL 

CPn_0353 396893 397135

No robust homolog present in Genebank/EMBL as of 11/7/98
UinWIKKSLIFIKRI RYSQSGKEOKGARPFFKKS ITSSLVirXLEAIFNENTSSI lONN 
rNKNFKNKN r S INR I FVKFT I 

CPn_0354 397062 398507 

No robust homolog present in Genebank/EMBL as of 11/7/98

YKTrSIKIUCtKT^IXIGFU/^lJ^YNTOIDEPRKCMSNITSPVIONNRSCNYYFELK^ 

7rHmSAILLCGALrAri^AAPVSYIT,y:;ALLGLGLLrALIGVILCIXKrTP><ISSKE 

QVrPQELVNRIRAHYPKTVSDFVSEJOCPNIJaSLrSFIDrXNOLHSEMJSSTNY^ 

OKIinTBCIARLKNEVRTASLKRI^AASSRPLFTSU»KIWKVFPFrWlZ;EFISAC^ 

VEUtRVKKIOCSIXmt^DYIKPEMLPTYVre-lPLDFRPTNSSILmjm^Vl^ 

OHLKYAAUCEWNUmSDUm«OOLrAKYHAAYOSYKHI^OPSLOn)EFYNIJXCIFKH 

RYSWKQMSLIKTVPADtWENUrCLTLOHTCRPOCHEJ^ASLIGTLYTOCLIHKESEAFLSS 

LTU.SU3QFKTIRRQSTNIAMrLE2JIJmWSTFllSU»PITVHPUCRSVFSOPEED£SS^ 

IG 

CPn_0355 399955 398591

No robust homolog present in Genebank/EMBL as of 11/7/98

IRDrnjIIIYTAFNRSISmJVMSWrrvPHALFKNHCECHSTFPLSSRTIVRIAIASLFC 

IGAIJUUjOCIAPPVSYrVGSVLAFIAFVILSLVILALIFGEKKLPPTPRIIPDRrrHVID 

LEJtNNWIFEIlLl^OTCPLYVn-OKriSAGDPQVCRDLGVPRECYCYYWUSPU^ 

I rCKETHHILOQLTKEDVLLLKNKAWEKWOTDEVKAIV^ mTYTARGTLKTEACCLT 

KETISKEIitl^UiGYSFDOLOLITQLPRnAWWtCFVDNSTAYNWir^ 

U)ESSIDFOVNLGLYVI0DLKEAVOAFSASDEPKKELCKFlXJUiLSSVSKRLESVIJ^ 

KRIALEHCNARARVYDVNFVTCARIHRKTSI FFKD 

CPn_0356 400465 400109 

No robust homolog present in Genebank/EMBL as of 11/7/98

KQVOLFOYMNESQ'JWLCDFDSOGECFOLSRLVGLLHSSWALYEAKEOFYLPEVSU.rWE 

ELIEMOUiSKPTKHGVAKDLCNVFEKHFORFROYLGSLDLNQRFENTFLNYPKYHLDRE 

CPn_0357 401341 400469

No robust homolog present in Genebank/EMBL as of 11/7/98 
YSSHNCASMVNrOPVYRNTOVNYSOATOFSVCOPALSLirVSWAAVLAIVALVCSOSLL 
J I ELCTALVLVSLI LF A3 AMFM I YKMRQEPKELLI PKKIMELIOEHYPS IWDF IRDQEV 
3 1'^IHHLI 3 ILNKTNVFDKAPVYUJEKLLOFGI EKFKDVHPSKLPNFEEILLOHCPLHW 
U:RL\r/PMVSDVTPCTYCYYWCGPLCLYENAPSLFERRSLU.UaCISFCEFALLEDCLKK 
NTWSSSEL'/OIRONLFTRYYADKEEVOEAELNAOYEQFDSLLHLIFSHKLS 

CPn_0358 401757 401578

No robust homolog present in Genebank/EMBL as of 11/7/98
ED/L^SMKLIPTODSIERETDSKRDKKIFTIYICSSKVLAGHFFSHLDKHNKIHESIGV 

CPn_0359 401994 403817

lepA-GTPase

ITLOY I LKEYK I EN I RNFS 1 1 AH I DHGKSTI ADRLLESTSTVEEREMREOLLDSKDLERE 
RG IT I K^PVmTYLY ECEVYQI-NL I DTPCHVDFSYE'/SRSLSACECALLI VDAAOCVQA 
OS tAN\rrU\LERDLEX : PVLNK r OLPAADPW r AOT lEDY IGLDTTN 1 1 ACSAKTCOG I P 
A I LKAI IDLVPPPKAPAETELKALVFDSHYDPYVCIMVYVRI ISCELKKGDRITFMAAKG 
nr;Fr/l/: IGAFLPKATFIEGSLRrCQVGF FIANUCK^/KDVK rGmvrKTKHPAKTPLECF 
K E tNPV/FAG I YP I DS3DFDrrLKDALCRU)LNDSALT I EOESSHSEjCFCFRCCFLGLLHL 
KI IFF:ftI tREFDU)IIATAPr:vtVKVVLKNCKVLDIDt^PSCYPDPAIIEHVEEPWVHVNI 
(T rOET^LT^i IMNLCLDKRC tf .VKTEMLDQHRLVLAY ELPLNE I VSDFNDKUCSVTKCYCS 
KnYRLf;[>Y(>KG:; t IKLL-VLINEEPrDAFGCLVHRDKAESRGRSrCEKLVOVtPOOt-FKIP 

; OAA r r ikkv t ar et t rai Krnn'AKCYGGD r tr krklwekqkkckkrmkefckvs r 

'•(•Mjl I'.'l tllMt.-i AiivrS2 

'TiMi tiyritti ri»'r ti.M I pror.Mii 

V/vt Vni I'M. tt ;UVVMt:KNLVl/4M I DHCF5;v.'nriTJRTPEKTR0FLKEYPNimEL'/f;FE.';LE 
I.KVM;:i.Kr<I*ltK IMlWIt'AiIKrvCXV: n(ALLPFt.Ef<;n.Vt tCf/INSYFKOCERP.CKELOEK 

. : I u- 1 / : I : :i X ;eii;.\ri h ; r*:; cm nTiNPFJvwPLVAF- r fo:; i aakvcx^r prcr,-wvrrrof y^c 
( lYVKAviirr : r rvf ;nioi . ii.fjw'I i i.roklkljatavat t lkewntlelesyl i r iagevl 



alkdpecipv:2t:i:w:^ jKWTAroALNj:r-T*-.::.: :~Av-«ARF-_:.ntfKErR£CA 

ARNYPCTPL: FEMPHDPSVFIODVFHALYAGK : raYAOCFM'^L:;£ASKri^WGLDLGElA 
LMW«X:CI rOSAFLSVIHKOFAA«5£JfWjLrFOEYfW3ALR»*^^ 
I PCLAAArTFYDr:YFrrASSSMStJ^OC£W)*rCAHTYER>tofl PRQ^^ 
K 

CPn_0361 40'i650 405382

tyrS-Tyrosyl tRNA Synthetase

r^HLFKNYCTrLCCGGOTWGNrrSGIDFrRRKCLCWAYCLrrPU-TNAOG^ 
TVWLDSOLTSPFELYC YLLRLPOOr I PK I ARTLTLLSNEEIODI DRRVQTDPVAVKEfVA 
Q0Il^IKCDLCL££ALSVTRSHHPGNl^SLSEKDFHELFACOCASLDKSEVLCKRWU3 
[J-CVLCLCKSKCEIRRLIEOKCVYINNWIAOTMSVCEEQDICYCHYVUJVOaaC^ 
YLN 

CPn_0362 407843 407055

fliA/rpsD-Sigma-28/WhiG Family

r^IaCKrVKTQQTQNIIEVW^^n^rfET0EIEYROSLIEF^XPLVKSVV^ 
DI.YASCVEZ3LVRAVERYNPERSRRFECYAVFLIKAAI ICDLRKCDWVPRSVHOKANKLSC 
AMO5IJl0SLGXEFT0IXU:CYLNISQ0£L5GWFVSARPALIVSU;£EWPS050&C»GH^ 
EERI POERA£TCYt7VVDK0ErSLCLANAIOELEEKERKVMALYYYEELVlJCZICK\rt^ 
E5RV5Q IHSKALLKLRAALSAFR 

CPn_0363 409700 407943 

flhA-Flagellar Secretion Protein

EAVrVSCKKIKVRGKirVTLSILVLIFLPLPOIUDFGLCISrALStiTVCWVrr^ 

SAKI^PPFFLYirLLRLCLNLASTRWWSSGTASSLWSU5SFFSLCSU<AATFAC^^ 

FVrTTLHVSKGSERIACVRSRFFLEALPAKQMALOSDLVSGRASYKAVXXOXNALIEECTr 

FSAKECVHTlFVXCDAIISCILIXVNWSVTCLYYTSCYAt^MW^^ 

TSCAAATLISKIDKEESIXNYLFEYYKOLROHFRWSLLIFSLCCIPSSPKFPIVLLASL 

LWLAYRKEEPASEDSC lERAFSYVBCACPKBOESQFYQVYRAASEEVFEDLGVRLPVLTS 

LRIEERPWUlVrOOWVl^HJfrPEAVLPFLRNIAHEAUUEVVOKYIXESERVFC 

rVPXKXSL5SLVVt5RII.VRERV5LK£JTKIt£AVAW0N5GDSIXIIA£l^^ 

GRSIW3QK0TI^ITIDFHVEELINSSYS1CSNP\^ENVIRRVDSLL£RSVFXDFRAIVT 

SCETRFQfiOQlLDPHFPDIXVLSKDELPKEZPISFLGIVSDEVLVP 

CPn_0364 409954 410238

ferl-Ferredoxin IV 

KENSKAKLVrrSDDroOEFEIXCyJSEIAEFCESMG : PFACTEGVCGTCVICTLECaiQILS 
EFTEPEVDFLGEPEDSNERUUXCRIKOCXVKVTF 

CPn_0365 410498 411544

No robust homolog present in Genebank/EMBL as of 11/7/98
nCCTTOVNSLIMATISPISLTVDHPLVITrKKKSCSNFDKIOSRILLITAIFAVLVTlGTLL 
IGUI2lIPVrm.TPISFIAVVLSNFILYKRATriXKPRACGKHKEIICPIW 
. ISZAZNRSKENWEK0PKDLQNLPAP5AII.TO»fPYEXWKAKHSlJ*5LV5LLPGCNP£HLLI 
5A5£^^JCXTIXICErSGNAPI5SY^^3TTPSPKSLI^£AIQETRVE^^^'£LPAGOSGE!^ 
MOPDFRGRVFLPOIPTTPEAIYOYYYALYVTYIOTAINTOTQ I lOI PLYSLREKLYSREL 
PPOSRJfQQSLAHITAVKYKAELHPEYPLTIACVERSLAOLPOESZEDLS 

CPn_0366 411976 412440

No robust homolog present in Genebank/EMBL as of 11/7/98
MCnfLPVSATOVLFESPAAPUNSANTONDKLIElJCCKOOAESSPRTITSVIlXVLLVICC 
CLlVl^LLAIRPAWFTtXrGHPAAIAVLAVSCTILLVAVIIIJrFUUVVPFAAXI^^ 
VKIVDDYASWHSHQQTPTIjCTIFSCIVYAESOAOL 

CPn_0367 413078 413836

No robust homolog present in Genebank/EMBL as of 11/7/98

SFPU«lYFKrKrrSIPCVKEJOSHLSVDERLISESP\^TKKEVIAKIXKLTAULAIAIA 

VCTAWACVLGMPLMAIATCAAUAAVVLSCUiRRREPSKPTEELUSPOKH^ 

VQPSVPLDYOKUJWEWTLVWLSEINISVa'LODPNORYYVWEHCXIAPITLVATrCD 

PRIJCriSCRVMIVNAANSlWSGGACTNAAI^AATHPrCWNNTRTSGGKINTGK^^ 

RSAFWINRDMTNX 

CPn_0368 413766 414107 

No robust homolog present in Genebank/EMBL as of 11/7/98 
TLAKDYl^WNAAOHPGSIETCRINOTOPGEAHFIAOLLGPKYEGEUCAHPEKL^ 
YI^FDEALNNOArWOVPLISSSIYSPCGKLELEPVNOTKPNSSAYKLYHIRT 

CPn_0369 414345 415562 

CT058 hypothetical protein_2 

NlfTTDSNPLPSYTeASLYRTPAKHSYP IRLPt^TDRIEK tUC r\rrLTlJUJWyVLCFSI A 
AC lU^MP I FSAVWITLAIAAVSLYSLLKKPKLYeiLPOIEPESEQSSLSPSPOPPEOOD 
LPUJIDPLPDPES'wFEVSLADLTTPPEELTAITVTPGYEALLEONWOLLPSEJUVVDPSFT 
TETPOOPCF IWKLKD3KLI FISTSCDI AVPR I KTOCRVK I VNAANENISREC30CTWKALS 
LATSLOC>VNASRLPRAHSR:9CSOLOPGECRSAKWE3fSDHTSNDHVPGKAHFLAOL£X3PEA 
AKCNNDPK0AFEV3KKAFHNLF0EAEI ICVDV IQLPL IGCNLFAPSRLLNU3KTRAEWI E 
AIKLALZTSLODFOWEOONeEEOKII ILTDKCQPPI I PPRFDLTTP 

CPn_0370 415755 416912 

CT058 hypothetical protein_3 

KR IFFKLFVFYLK5FMSTTEPNLTNVNLTML I3SESMPT0LASHKLKCLDLVAF ILl ICI 
AVSSGTAAI lu: rPLLFILTALAVLAFSILLYFLLREPKSPISVTHOPTPI IKDTDLPPV 
PPLALTPVPTEA\XEEPPLPSPRTHOTtXOEfiWDRIPDLOANTDMPFIAADN0TCYAWHL 
KNSNLTL ISTljCr lEKPRVKTCC r VM I VNAATPrWANNVKGTSIJVLAKATSVRCWENSKK 
SP0PUlSK0PU:u:ECR3AKVENUr?TTNAOKAOLP0FLC0LU:PKASDYNYNPNDAFTF 
CROAYLNC£^EAKRRKTmVLPLLGGHFPCGPKDEETTSUlU>* r KTHvLAL I DALQ^ 
GSEAENONOPWV r : LTTLARHPHTP 

CPn_037l 417UI 417^0: 

No robust homolog present in Genebank/EMBL as of 11/7/98

KTMPV5.':APLPT>:iiRp*;:*»f:Nu;LMEr'ti.';KAr.KAKiionKrpKTrKi.i.vK ILVA r lvccvu: 

1 1 AAFF r PfTTPP Iv L I lUWUlUTT/U Vl.l J.V f KL.\t,VriKTR7rrAhXVi KRKU:CK3 1 



t:pn.0»72 41 't.*;! 4 1 MUM 

No robur.r htwtk*UKj i>u*t:<mf. irt '^ iHrt.ink/l-Xlit. .i:-. oi U/7/'»H 

MY RACHP^ It •MHH.<:> PWTXTri^.'rArjP/l VPK I / ;I:F[ .KHIJ/:: U ;m ; t K t AFAA.-rrAMXLN 
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CPn.OJ?) 4lRJS'i 4202 IR 

NSE IFEI FKTL ITPAI NSSPRKTHT^m ICNLY rCSDHS r !aX5S>frTTLTTDI DSTV^ 
AI>£MrcD r VRVT70C IKEAOACEKI KERLI AbSLN I PLVA0 IHFFPOAAMLVADFADKV 
R INPCW IDKRNMFKGTXIYTEASYAOSLUlLEEKFAPLVnCCKRUIKAMRICVNHCSLS 
ERrM0Kyctr'IECMVA3Arr/IAVCEKLNVT?O\A^SKKSSNPKri^AYROLAia3UlAR^ 
WLr PLHI^^EAfTMpVDT.I rKSAVC rCTLIAECLCDTIRCSLTCCPTTE I PVCDSLLRKT 

VI,- A --.-/! : y PA t • ! Tf^v : > FMFi .vrnHnovi '.'iv IN twr;~r.-* 

/HCAP^VH FHASDPF I HTSRDFKEKCXJHOOKPTKLV FSRDFDNKEEAA: J XATEfCALI^ 
KLCCAVVLDLPNLPLODVUCIAFCTUJNACVRLVKTEYrSCPMCCRTLFDLEEVTTRIR 
KRTOHLPCUCIAIHGCIVWPCEMADAOFCrTOSKTGMrDLYVKHTCVKAHIPMEDAEEE 
LIRtXOEHGVWKDPECTKLTV 

CPn_0374 420209 420961

CT056 hypothetical protein

VDS^f^LSFHTHPUmf^FEE^a:LPIRHC^^KOKnA£Xr^yFAAKNPEIASAL0^ 

UiORKGTSVRCVTPTSPTVQPArcirTOSPIiSUlIRHSDCOAAirmiEHHAIANW 

WRCIIJiCair YAVTVC7TOKKLFH7KP0D1JVA ICPS rCPDYAI yPDYATLFPR^ 

NH FDLRAIARKOLTNLC:SK0R IFISDLCTYTEHDArFSSRYtAHHPDPNLTGOHSKNRN 

^^VTAVLLL?RD 

CPn_0375 421112 421615

No robust homolog present in Genebank/EMBL as of 11/7/98
RLSMKLGASrrmKVHEPVKPKKAOUVErEANKTQATECTLRSKSLALOIARAVLYILFAA ' 
LMLAAG ITFVTFLALGFPL lOAYS I AGI ITLVCLAICLVLLI LSLLPKEDEEADALSRNA 
LLPLTIIVIEOOPITPKPEI PYSYLTKLALLTSLFLTLRRSSSQRKTH 

CPn_0376 421680 422294

No robust homolog present in Genebank/EMBL as of 11/7/98

FKVVTAKAPNLTEIRDKGARVPSIJT^PETSHWKGDKZVSAPLKOLODLLCEEOWEAM^ 

TKMNSRXKAGOWAIFNSPTPCN^STLVLAWTPWCYYDKIWODILER^ 

EFIJ<NLFVDLr.ENGFTSVHIHAEEArrPLDHTGKPHFKRDN\m*P^ 

VSADTOFTLFLTODECNPFHDKKRC 

CPn_0377 423441 422347 

sucB-Dihydrolipoamide Succinyltransferase

IhrrmmiPNIAESISEVTVASLLVTECALlOaJOGIXErESDKVr^JLIYAPVSGRIFWE 
VSBCTVVPVCCVVGKIEPAGBCEEIXIDSOSKETIEAEIICFPOSGVROSP 
OQKIXWSOGt^ACDRCETRS^frSIRJa'ISRRU.SAIJiESAHLTTFNEV^ 
KOEEFI^RYGVKU^FKSFTVKAVLEAUCAYPRVNAYIDGEEIVYWrrroiSIATO 

?P0TOIUa^HKIEKRPWUl^^EmADM^fYVALSYDHRLIa3KEAVGFLVK^^ 
SLLDL 

CPn_0378 426195 423445

sucA-Oxoglutarate Dehydrogenase

IVFIEFNYFKDSETVCOVYSSDMrwiESMYQRfMNHCTLDPSWKYFFECYOOT 

ASTXISGNErriAMLOEQKSOFLCTIYRYYGYLQSOISTLAPTTDSRFIOEKrAKIOLDEQ 

VPSACLLPKAOVSVRELIEAIJacCYCGSLTLErLTCTPELOEFVWN^^ 

IJlSYia3l£KATFrEEFMIKFT(XKRFSLE0GErLVPML£HLVHyGSAI^ISWVL^^ 

RGRUJVLTNVLCmRYVFMEFEDDPAARGLESVGDVKYHKGYVLKSHOra 

NASHLESVDPIVBCWVAALOHOCTACKEOSStJVrLWCDAAFSGOC^ 

STEi;TI^IVVNWrGrTAVPRESRSTPYCTDIAraiLCIPVFRVNSErWACIM 

VmERFSCDVriDirCYRJCYGHNESDDPSVTAPU.YDOIKWCKSIRELFROYUOT 

SEEnj^I£3aiOESLNRErOVUCTT3PEPFPKKECHHCDRIieCELIUnx^ 

LFKMSSRLCGrPDtnTiPHPKIKTLLEKRMKMAEGGVCYDWAMAEXIJU^ 

5??2L^^^^^^^^°'^^'™^SPLYHLSAEOCSVEMYNSPLSEY^ 

OALm-VU^FCDFANGAOIIFIX}YISS^ 

lERYWIJVANWFOVVLPSTPVOYFRItJlEHAKRDI^LPLVIFTPmJA 
FTEPGGFRA I LEDADPNYDAS ILVLCSGKIYYDYAEMLPQDRRKDFSCLR I ESLYPLALE 
DLVSLXDIOfSHUCHrWLOEESKmSAYDYMFMALODILPEKLLYICRPRSSSTASCSAK 
LSROELVTCMrrLFSLR 

CPn_0379 426268 426765

CT053 hypothetical protein

5rrr^Si:°^°^^^^^"^'^'^S**^*^^P°GPSMSDIErVEPTETEI0IDPGETV 
U.ELTPECREDCAVEVDYSHEDDEDPFSDRNRWRROGI IDPDANEW 

CPn_0380 426671 427876 

hemN-Coproporphyrinogen III Oxidase

KSTIPTKTMKTLSAIAIACDAWSLIPMLKMSKAPLALVIHIPFCTKKCRYCSPrriPYK 

TLgss^??5^^^^^^2f'""^^^ps^vspLSKRI^ 

r^m??^J^?li:S2,^E^P^^^'^™S^«=^«*I^SSSAAITAW 

Ii?nii^^^^"°*^^^^^^^^^^^PDVPAKHNLYYWrDRPFU:LGVSASOYLH 

..MLTODVKLONLFSVHOOCLALNROGRLFHDTIAEEIMCYSF 

GPn_033l 42?836 428037 

trr326 similarity 

^J;f!?^^^-^^^^^^^2PP^^E^EP^PNPrPAOrOIPRITISPPSLDVSTVASSA 
ED 13'/F t ACCPRSSSSASVASDVYEL\rt:u:aGDEOPEPPDsSmTLY\^GS^^ 
ELLYr5EVPCE:A\mU.'/NrcSCMSPWPrSPCRTLWLDHPLC0^ 
REFLVirrcDA^rPYrCWALTOSRHSPRINAA/CISPTVFIOCDFRwS'R^ 




. , /;LA0EW I LA::^ ECLD W I CY i LJiHTTt^AVTl Y FFLLR'NYPO^R EP FRTAfl I VAO 

:^v[.p^•rr.vLVFu:<;Nw.fiK^^wPOEaRAIFtsA::Tr.^;cstvFvS^ 

VKVW/rcCfXlMrrVRASYRDRACF r tCFUJVVHCnv lpv^ti^^ 

w./urrAWDuiNKnAEEr^:;:x:Dvt^vrcTt/^Fiu:A(?l;[?^^^ 

V ii^ ;/y r .1. -::am • (Vr( ..rul<M)r. M.-r txyr. r.in^ t Ar.jse 

. .i.Um.O.,fM.U VA;.IJU .r.PJELVLTROVOSWRTTFXLCSVKaS rXKVPT I FLPH t PN 



':Pn_038) 4. : 

CT047 hypothetical protein

VOOTTFLTLPMOItSLTSrtWGQAVAeKVPAIALJ'i.'ULECDKOAlJCLW.'JC^^ 
(XLMPATLMSVriTFAUtJEHETLSXJHiiEXFPtAtKEyt^^ 

ECFRELSKALPSALSLSLf CEWPADROKR t IRLLLORAERVr: rSCSCSLAJLFLiULAST 
SLPOII^EFDKLLCTi/GKKTSLDHSDIKELVrnCKEKASLWKFRrSLLKRDPVICHOOLHF 
LLEDCEDPU; I rTFLRTQCLYCLRSI EBCSKEJJKHP,Kr/:.YCKERLHCAUCLFYAErL; 
KNNVOOP tVAVETLV I RMVKL 



hctB-Histone-like Protein 2

VITCLI RC r KMICAOKKCSCKKTASRAVTlKFAKir/AAKRn'KKA'IVRICTAVXKPAVracrA 
AKKTVAKKTTAXRT^mKWAKKPAVKKVAAXRVVKKTV'AKmAKRAVWC^ 

TTVAKCSPKKAAACALACHKNHKHTSSCKRVrSSTATRKHGSKSRWrAHGWI»K3t.IKm 
SR 

CPn_0385 434042 43252:

pepA-Leucyl Aminopeptidase A

FOVtRCEFVVUTiAOASCRNRVKAnAIVLPFWHFKDAKNAASFEAEFE^ 

KTCEIELLYSSPKAXEKRIVUXMKIffiELTSDVVFTrYATLTRVl^^ 

ISElJU.SAEErLVCLSSGILSLNYOypRYinCVDRm.£TPLSKVTiaCIVPKMAnW 

AAIFBCVYLTRDLVNRNADElTPKKLAEVAU^XJKEFPSIiyrKV^CKIUIAKEXMaj^ 

VSKCSCVDPHFIVVRYCCRPKSia3mVLIGKGVTFDSGGU3LKPGKSKLT«^^ 

Vl^IE^AIJ^VLEU'INVrCIIPATElWIDGASYWCDVV'VGMSCl^ICSTn^ 

ADAITYALKYCKPTRI IDrATLTCAMWSLCEEVACFFSNNDVLAEDLLEASAETSEPLW 

RLPLVKinrDKTLHSDrAaOWLGSNRACAITAALFUJRFIXESSVAWA^ 

EEDRYPKYASGFCVRSILYVLQJSLSK '^w'*^ n«v 

CPn_0386 434543 434046

ssb-SS DNA Binding Protein
KSKCYUIMFGHFACrft^PEERffrSKGKmTLRLCV^ 
DKHIJ»YUCKGSCWACOrSVESYMSKIX:SPOSSLVIS^^SLK^SPFtaWB^ 
hmOOVCYESVSVGFBCEALDAEAIKDKCMyACYGQEOOyVCEDVPF 

CPn_0387 435329 434699

CT043 hypothetical protein 

^»MttI^DSI>^SRC3^^AEEMJCNFAKEIJCLPOVAFD0N^ 

RLYVYAPLLOCLPrmORXlALYEKU£CSHLCGOl«CGCVGVATKE0LIUmC^^ 

AETTttlJCAFAOl^IEIVVKWRTVCADICACREPSVimiPQMPO^^ 

CPn_0388 435323 437320 

glgX -Glycogen Hydrolase (debranching) 

STMEKVSSYPSVPLPLGASKISPNRYRFALYASOATEVILALTDOISEVIEVPLYPOTHR 

TGAIWHIEIBGISDOSSYATRVHGPIOmGMQYSFKEyLADPYAKNIHSPOSFGSWa^ 

YArcyUCEEPFPWDCD0PLHU»ICEE«irYE»fVRSrr0SSSSRWAP^ 

HKLGINAVELU'lFErDETAHPFRNSKFPYirNYWWAPUIFrSPCRRYAY^ 

EFrn,VK|MQECIEVIl^^ 

L^mmAPTTOWlLDIUlYWVEQIHVIX;^R^)LASVFSRGPSCSPW^APV^£AIOTP^ 
ASTKI XAEPWDAOCLYOVCYFPTLSPRWSEWWPYRIXAnCAFUCtXJ^ 
QDIYPKGSPTNSINYVSCHDCFTLCinVTVTIHKHNEJUCEra^^ 
ajPCItXVREROLRNFFLTtifVSOCI PMIOSGDEYAHTABCanWRMAtDSN^^ 
TAWTl^IHFLCDLIArRiaCVKriJTmGrt-SNKEISWVDAHGNPI^^ 

AHVVVAFHVGAODOLATU>KASSNFLPYOWA£SQOCFVPOWATPTVSLOPOTTLIAIS 
HAXrVT 

CPn_0389 43S254 437319 

CT041 hypothetical protein

SEVICVSDTFVTCOOTV\^PKIRVIi.SNESTTALrEAKCPYRrYGDtAaiOTAIOGO 

ALYBCrRV^EFYPCWCUCIEPVOTTASLFFNGIOYOGSLYVHRja)^ 

YUCSVI^IKYI^EUSKEAI^IIIOTALYEKLLARNPCJNFWHVKAEEEC^ 

FYGVEEAIDWARLVVDSP0CLIInA0GIX0S^Aa)RLAIECFT^AR01LEKFYKDVDFWI 
E5WNEEL0CEIR 

CPn_0390 439171 438134 

ruvB-Holliday Junction Helicase

RKSDRECSYKrH0VA\aH0DKKFDVSLRPKCLEEFYC0HHLKERLDLFI£AAU3RCEVPG 

HCLFPGPPCLCKTSLAHIVAYTVCKGLVLASCPOLIKPSDLLCLLTSLOBCDVFFIDEIH 

RMGWAEEYLYSAMEDFKVDITIDSGPCARSVRVDIAPFTLVCATTRSCHLSEPLRARFA 

FSARI^YYSDODLKEILVRSSHUXIIEADSSArXEIAKRSRCTPRIJWHLLRWVRDrAOI 

REGNCINGDVAEKALAMLLIDDWCLWEIOIKLLTT t IDYYQGGPVG IKTLSVAVCEDIICr 

CEDVYEPFLILKGFIKKTPRCRMVTOUVYDHUCRHAKNLLSLOECO 

CPn_0391 439701 439510 

No robust homolog present in Genebank/EMBL as of 11/7/98
KDOLYKQEKP t PKAT I LSRNLEVMLOWPKGKRQTLFtCRTSCRSALYSYSRR ILVLLNAF 

CPn_0392 439814 440383

dcd-dCTP Deaminase 

MS IKEDKW r REMALfUDM I HPFVNCOVtA^EETCEKL ISYGLSSYCYDLRl^REFKV^ 
VYNSWDPKCFTEDIF IS ITDDVCIVF PNSFALARSVEYFRI PRNVLTMCIGKSTYAHCG 
I^J^^^EWECKVT IE ISNTTPLPAK lYANBC I AOVLFFESSTTCEVSYADRKCKYO 

CPn_0393 440229 440723

CT038 hypothetical protein

WCFRliEEVMIKGWWrFSatOCWOnAIOELRTEELRUJSKVSSLCODILSA^ 
OW?LHLOI(WODaA/\tEA,\LIORLnLir.K<;YKKI/:V;7PKOoSENKD 

(•pn_0r/4 440727 AAl-u,H 

tlyC-CBS Domain protein (Hemolysin Homolog)

MLLtTLiFrDr(:t^tArONCFAtLiv;r^^.a^n-;,:,.,.,jv[Trju:Eru»KAVA^^ 

VWEE.^RLLVJYt^Us-DCSVKERMO^f'OUirj.Yrjrv.vri-f.ErJLYIXKi^^^ 

a.NU/;r(n-An::LUj.DKrior:::ocL:,i.ixKKtvYMiM:rn;AK^^ 

0KYi;:;iEni.lT0EDLFElVAt:ElVU>UIKlt/riT:UVVbVMA;ari^2^^^ 
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Ct::^? hypornecical procetn 

CNCMTNSALFWICVN r :CI*/U^FY3MKEKAC/aFKRVRL0YYr.TKDHKKA«y INTLIRR 
PYRLPCTVNU;VNIAU:vCSE3SRNCYIUU:iTPCnrAPrP0IFrVVIFA^ 
PEKLM^AP I Lr^3HY t FYPL lOLICSLTECL'mXNIRXEXLNSTLSRDEFQKAUTH 
HEEOOm r ATM I FSLSATCAajVCOPLEQVTMLPSSANVKDFCRT IKNTO INF r PV^ 
ARKNV 10 1 AH PKDFVNKAI^EPL INNLHS PWTITAKSKL IR I U<EFRDNRSSVAWLNAS 
CEP IG I L3LNA r FK r LFNTTNI AHUCPKT rSVr ERTFPGNSR I KDLQKELOIOFPOYPVE 

■ri^y\:::: .:.:.^.t m*/ — rv •L.:":rCr/:::Krir.:_: 

'JPn.03'i6 444 ib^ 44ii41 

yhfo-NifS-related protein

yskiyldnnaktppergllefloktfli ecty anpssvholckksroi.vleashwm0kvl 
sf(x;rn^vtscateslnlaiaslpkdshvitscsehpailepuchsslsvsylnpeecrc 
vlt leoieravtpktsai ilgwvnsetcyucaoiaaiahfaqerolorrvdatanvgkeri 

yifkylolhoerisoelltkrngfekaikaripdvhihcadoprannvsaiarpplbcev 
wiau)iecr/accygsacsscatapfkslvsmc\n3eeltlj\*tuirsfsku^edverav 
gziekwerucns 

CPn_0397 445124 444381

PP2C phosphatase family

EHFVDFDYFGL£DICRVRARNEOFWOVNI>IS0WAIArX;VCGRI/X:DIAS0EAVTSLMEL 

IDEOOSKUWYGDDOYKETUCKItl^EVNawVEHGQKEEHLOGhCT^ 

FHVCDSRIYRIRBGEUlRLTEDHSLENOLXNRYGLPKOSDKVifSyRHILTNVLGSRPrVM 

PDIRm.PCEK£DLYCLCSDGLTt«VPOIDIROILN0PATL££RGrULISlJUmU3G0DNA 

TWLVRIO 

CPn_0398 445518 445700 

No robust homolog present in Genebank/EMBL as of 11/7/98

I EELPKQIENSSIt^AEWMKWFIFSVISAPVVFLPGCTLIPKElCVTKVPSQLWSESLSO 

P 

CPn_0399 445759 446523 

CT253 hypothetical protein

YKIi^VI^KSUJCESIDLXSKNTPRARIFOaSNLSTmRKMLVLLASl/^^ 
CTHLCSSGSYHPKLYTSCSKTKCVlAMLPVrHRPGKSL£PLPWNLWErTEEISKRrYAS 
EKVFLIKHNASP0TVS0FYAPIANRLPETIIE0FU»AEFIVATZLLEOKTGKEAGVDSVT 
ASVRVRVTDIRHHK lALI YQEI I ECSOPLTTLVNDYHRYCWNSKHFDSTPMGLKHSRLFR 
EWARVECYVCANYS 

CPn_0400 446527 447306 

CT254 hypothetical protein

SKEJiSKFILLI^LCVAALASKNFFIWPAPSGKTPUOJlOVLFGGAIXVrSSLVAL^ 
TAELLSTKItSISLAFATlJTUJXPKDITRAILFSGERPVKTSWRALGS AIRMWI 1 1 IPV 
TOLrGIKMSKrLTLVLPTOEIHTQEVroEVONSLPITGHYISMILNU;\rt,TPFGEEVF^ 
GIMTFLKNKKTRIAAVLCSSl IFSFIHIEHSLCSWVFVPVLFVFSLSACFLYDmRHIL 
SPIALHCLFNLTSLLFU3IK 

CPn_0401 447884 447495

CT255 hypothetical protein

MRI3HAFSKLIGTVRAMVVEGRCPWSL00SLVSM\^ILCBC0EFHEAVI^iaV0E^ 
AGDVLTL\^ILCFU£REGVLASEI7VANEAH£KLIUUIAFYIFAEDYKFVSI££A0RI^ 
AKHRaCNEST 

CPn_0402 449012 447888

mutY-Adenine Glycosylase

NFKRFCOTKIAFSEKAKNFPVEAUCKWFEKNKRSLPWRONPTPYSNWSEVML^^ 
VrDYFN(>(MERFPTrESIAAAKEEDVIKLWBCt£rySRARHlXBCARMVMEErHGKIPOD 
AISIAO IRGVGPY7^mAILAFAFKRRAAAVlX»AaAVLSRIFLI^^S IDI.ESTRTMVSR I 
AQAI^HKSPEVIAEALIEUIACICKKVPOCHRCFVRQACGAWRENKOFVLPVRHAWC^ 
IFU«U.VAIVLYDCSLVVEKRRPKn<MACLYEFPYtE\nEPEECUJDIBCFTKKMELSLES 
■PLEFUSNLKEORHAFThWKVHl^PIIFKATSLPOFGELHU^OIDHIJtf'SSGHKKrKDAL 
LIYLGDVRSRESIGV 

CPn_0403 449009 449710 

yceC-predicted pseudouridine synthetase family

tn^WfroKRAALOYFMEIJFSWLATOVSRLSSFLRSOLPNHSKOEILASIROHRCRVNGF 

lERrESYKVOPCDRVSLSCIPSTKOOPSrLWEDDYSIiyEKPPHLTTEOMAHhfTRFFTVH 

RI^KCTSCCIXMGKSKOAATEUQaJKORKIHKOYIAFVFCHPKKKFGTVKSYTAPVYRR 

CGAVIFCAAGPSQGEP I KSAYKWDCWVILLSEMSTTDLKNSLPRSSALSSKLTP 

CPn_0404 450962 449871 

No robust homolog present in Genebank/EMBL as of 11/7/98

ELEAIXOKYGKAVLLI AL3 ELC I DrrMSLLSGHRLECFPPIAEVKAACDRCSMDFCEIUCS 

QSMOLWADAASCVDGLLODPFWSTAIASGIAKSSLQETEFECESKVMVXSSWCEOGAOVC 

SPFNLERICMSFPSLKVFSLKKNOCENMCIOLSASCKNLLMSIFrVATNCCSTPIWtTKE 

NLMALVALVLSHYOCYP/PATGDPORGNILGNPEVNAILARGMCHRVDLERKRGGESSSS 

RVLEU«^fiCFENSLTKTSU^•DAN^iV0ERDKCL^3MSTSLMHTACLNL0RPPVPTPSCVT 

.^HP0POPOPWTS0PSLLCARERSPVSSRCRFPVV^.Pt.SVI3PRSHFCRVERROLEDEEE 

EVMF 

CPn_0405 451814 45096^

CT105 hypothetical protein 

NrOTSHSRVLLKKFSKEFTrRTYRSLCrrDYLCCCLTNPLCKFPSPONPOWTrAPSSTT 
POAVSSAVCXIFUyrCGAASSTATmASCASACGLSPDOVOALLTNLUAroOPSVGOPS 
.':ACTGGAS.':S5;ASM0Q0Lt^Lrl^KTTCSCCSSV'SSE0L00LLSLVSO^^TS0OCSGGT0 
AC0AASVLLNLLSATC3AAANPLCTAASLA0I IYAA^>TSPGAKKTSEFCYNYCCETCOCN 
CGCPTCCCPOCOCCCGGFCRFFCCVWKNCCCrCEC.'SOEPAIPL 

fabI-Enoyl-Acyl-Carrier Protein Reductase

xriFHLK tDLTf:KVAFVA^UCDD0O•/GW^':IAKLU^EAry^Tl tVCTWrtYKIFSOSWELCK 
KNF-';RKt.':NrTrU.ErAK[YrMDA.':FD:;rF.DVPEDIAE«KRYKCITC:m.';r/AEOVKKDF 

. ;ii mtLvi[:;i^;:i'R r::K::LLtTr:nKCYtJAu:A.:;;YCFvr;i.L::i(F.:j iwmtnnTUiLT 

yi A':MRAVfY;Y';( » :M::::AKAALE.^nTKTU\WEAc':;<r'Wr;rRVNT CI.V.rL/ViRACKArOF t 
t:kHVDYY^;RWAr-£ I'F^NMNAEOVf yW.\/\Ft^.irLA;;A ITnETI.YVDIlOANVMC: tCPEMFPK 

ix: 

t 'liijMir/ .1'. t /V/ .|'.2HSj* 

HAD superfamily hydrolase/phosphatase



rr^CDAM EKLL'.TD I DTT :t I (LDKjr.VER LYALHCA^:WK l.Fr*wT:Rr/KYAAP SD 
FOAPYLUCCONCAbVW.l.TTw-NLLYJKSLF.'DLLJ I LCOCMErUTALFT/ESCAPYCOHY 
YRFSPTPIAODLHEyvnrRVPPNAKEHEZIrfETRIiUcmYAEPSflAMiCWCLftDCUIR: 
OKELERCEALTSVATKTU^HWBFOPftyAlttrLTOKaV^ 

SCODANDLDLIERGOFK r/M3SAPEEMHVHADFLAPPADKNC ILSAWEACVRYYOOLMSL 

CPn_0408 454090 4545^1

CT102 hypothetical protein

£F P P DTU 1 NH LUJ Hi H 1 Kij J J J JAT JL' r U KA ; v» K i ,K K^VITJ^* L 

CPn_0409 454645 455127 

CT260 hypothetical protein

KTTWTt^NNX-TKFLKSSOEEPFLERESCLTYrNrOANCNELPLFTVTRSBCEILQLICT 
LPYOUlESHKASTARUitLLNRDr 0 r PGFGMDEECGLIFYRLVLPCUCEIHOTLUirYI 
OTIKLVCOSFSHAIGLISSGNMrfLDELRROALOEQQEKRNE 

CPn_0410 455087 455333

dnaQ-DNA Pol III Epsilon Chain

DVRLFKSNKKNVMSSOTKDVLI FV CTl t I'lVlV lERCRI lEIAAYNSVTDESFLTfWPEI 

PIPDEASKIHGITTDAVl^APKFPEAYBCFRKFCGEDSILVAWWIXn^FTU^KECRRH 

SLEPLTNRTIDSUCWAOKYRPDLPKHmflYIAQVYGFAEWAHRAI^CVVTLHlCN^^ 

GDLPPOQVLDU^SYHPKVFKMPrGKYKGOPLVDIPKSYFEWl^QCyaiJKPENKDIKA 

AIALLHQPT 

CPn_0411 455794 456609 

CT262 hypothetical protein

RHOSRYSSITSTWILTAAFSPCPNDIFLFRSFUmpQFRPLLNQVTIADICTLNnALO 

RRI^Lhn<MSAALFPLVSOYYNIi<DVTOTLCYNSCPIVt^LCPECSLD^ 

U:KLYYPKAKLIPMPVDKILSAIL0CKVaX3ALIHEERrSYOWLTUWOFCELWRRCT 

FPLPUXIAIAKYVPHATVDALTAAUUCSLICSIJmPITACyUCAVEYSKNKN^ 

GTYXNlCrrFQLSlCTGKXALKMLWKANECCOyT 

CPn_0412 456515 457246

CT263 hypothetical protein

EPISTKKPFVYUCUaOCLYICSGRPMNAWrPKKrLCIVADYREISPLIEOLDFTOINEH 

LySYRCTDYHrrLYIV>f\n«STAVUWWSYCOAYTDYDLWIMPGr\^ 

TIEIUANLTTt7rPPVLSEDPPYIFDAIJ»DSLPKSSLVTSPVLYHYGFHJnTTCliJM^ 

IASOAA£HHIPCSFUCITSDYTVPGDCPFSRLEEVSOKLTOT;.Vm^ELMERAIPPKLL 

LPCP 

CPn_0413 459209 457227

msbA-Transport ATP Binding Protein

VPHKLUJCAVUVOCMiLVILGCSUAILGLTTSSOMCIFSL^ 

GKLVKVSEt^QKDILEWMSKDSCrLTVSDATTYIAEHGKSTASLTSKUKFVW 

VSRfRCLAIFLICVAIFKAVTLFFORFUXJWAIRVSR0LRODYrKALOOU>KrrrHITO 

rcm.S^mV^m5SASlALAV^«LKI^nflOAPITFILTLCVCI^ISWKFSILICVAFPlrIL 

PIVVrARKIKNLAKRIQKSQDSFSSVLYOFU«n«TVJC\^TEKFAFTIC^^ 

EEKSAAYCLLPRPLLHTIASLFFAFVWICIYKrAI PPEELIVFCCLLYLrVDPIJatlCD 

EJffSiyOlOCAAAERFYEVLNHPOUiSOKEREIErUJLSNriTFQWSFCYOEDICHrU!^ 

SFTIi«CEALGIVGPTCSGKTTLVKtiPRLYEVSOGKILIOSU>ITTO<ICCSL^ 

L0NPFLFYI7^VW^MLTCCXIM£££AVL£AUCRAVADEFI UCLPKCVHSVIXESCa^ 

QQQRIAZARAIIiCNASILIUEATSAIJlAISENyimiCEUCCOCTOIZXAHXLTI^ 

VDRVLYIEIKXJKIABCTKEELIWPEFLK>WElOTTCEYNRVr\^D^ 

T 

CPn_0414 460203 459172

accA-AcCoA Carboxylase/Transferase Alpha

LCUlIVCIKMIIJIRGEHILKEU*PHEKOWEY^KAIAEFKEianOa«LI^SErOKL£K 
RLOKLKEKIYSDLTPWERVOICRHPSRPRTVNVIECMCEEFVELCCORTFRDDPAWOCF 
VKIQOORrVLICOEXDCOTASRtilRNFGHIXPECFRKAUaCKIJkEKFXn^^ 
AYPGLTAEEROOCWAIAKTJLFELSRLATPVI IWIGBCCSGCALGHAVGOSVAMLEHSYY 
SVrSPErXTASILJOTPKKNSEAASMUCHHGENIJCOFGirDTVIKEPXayUmDPAW 
VREFI lOEWLRLKDLAIEELLEKRyEKFRSIGLYETTSESCPEA 

CPn_0415 461522 460221

CT266 hypothetical protein

SOTGFLPCLTLin/I I IVWCNAFLIKLCVIMGLOSRLOHCIEVSQNSMFDSOVKOFIYAC 
ODKTUIOSVUCI FRYH PLLKIHDIARAVYLlJlU.EECEDU:LSFL^AWYPSGA^raXSC 
GGFPWKGLPYPAEHAEFGIXLLOIAEFYEESOAYVSKMSHFOOALFDHOCSVTPSLHSQE 
NSRLLKEKTTLSOSFLFOLGMOIHPr/SLEDPALGFWHORTRSSSAFVAASGCOSSLGAY 
SSGDVCVIAYCPCSGDISDCYyPCCCCIAKEFVCOKSHOrrEISFLTSTCKPHPRtmiFS 
YLRDSYVHLPIRCKIT ISDKOYRVHAALAEATSAMTFS I FCKGKI*;C0WOGPRLRSCSLD 
SYKGPGND I M ICCENDAIMIVSASPYMEI FALOGKEKFWNADFLINI PYKEBCVMLIFEK 
KVTSEKCRFFTKMN 

CPn_0416 461871 461557

himD/ihfA-Integration Host Factor Alpha

EALSW<ATrmCKKLIGTI50DHKrHFtmVRTVrONFLOK.^frDALVKGDRLEFRDFCVLOV 
VERKPKVCRNPKNAAVP I H IPARRAVKFTFCKRMKRL lETPNKHS 

CPn_0417 463047 462244

amiA-N-Acetylmuramoyl Alanine Amidase

REKGMKLTKYLNTKOLRSMISRLFVRYSLPMSKOLSFFALCVLCSHPIFAOTPNPPORVR 
R:; EVt F r DPCHCCKD0CTA3KEU<YEEKSUTL£LALTVi^.T/LKRMrjYKP0LTRSSOVYVO 
LCKRVAtJNRCCXIDVF tS I HCNHSSNAAAFCTEvnr rVWr/GSPTRNRMSEVLCKN tLAA 
MEKNC I LKi^RGLKTANFW IROTSMPAVLVETCFLSNSP ERAALODARYRKHVAKC lABC 
VMNFLTTCPl^FOKPKCNCAK fRKPOIQAN 

<:r'rt_(ltlH '11*4401 4n2*»53 

murE-N-Acetylmuramoylalanylglutamyl DAP Ligase

MIJI J<KI .IJfOVOAK I Yi IKVnPLEVWJLTROSRCV.TVGDtF lAIWnTjP'fWMUyAVOAUiiNC: 

A I A r A: ;::i.YNi'Kr^:wy [ (TPf/i j:£LEAEL.';AK*rrEYPr:~KLHT uvrTtmrirmrcLt 
KA( .1 ji:;y'.iKi-; a :i .u rr rnir ii":tx.v i KDnnTrTPALLor/LAThf/MfjNRDAryMEv::.'; t 
' :r A'j U'.vA'rwn tr,\vi ;pn iTr.wir jjrKnTFErn/AAKAKLFSLViPia'jwv rirrospYA 
: :(jt • ( F- : AKAi V r -n** ; ( i:: tAAUY patd rou •;:.':'JTKYTLr/GOyx i A' ::::;::f fiK'nwYNL 
( AA I :ri'vi lA; :i .m i u .ki»i .i.i:k ir :u:opi*h inuDpVLMOPC w r dvai riTiiAi Jiwr.Tnt. 

IIKt 4 .fl / y :UL I WI-. *. > :i •Ul ik:;KhKLMAOWERYCFAr/Tr;DNlh::Kf*f'Kr/r/NE [I.W 

r/: :knyk i r. i i tu koa i tvai ^ : t a: f> i vi. i fV;KCiiEAYO i fki ir/r/AKW^cypA 'kvla 
pbp3-transglycolase/transpeptidase

OLFFWi rwru POKKVSVTYPMSYRKRSTL rVXX:^ALYALLVLRYYK 10 ICEXS)HWAA£ 
ALCOHEFCVnDPFRRCTFFANriVTlKCCKDLOOPFAVDITKrHLCAOPLAIPEEHRCEi: 
: U}F lECCTYCCUJUtLOKKSRYCKLYPU^SVHDRI^LWWKCYATKHRLPTNALFF 
ITDYORSY PFCK LLCXJVUrrUlEI KDEKTGKAFPTCCMEAYFNK r LECDVCERKLU^ 
NR Ummv I KLPKDG^D rYLTXNPVrOTI AEEELERCVLEAKAOGGRL t UCiSCTCEILA 

: A. vr- *"*;■•:■;:.•; rn::;::- :i;;r: ::7::r;;*:.v7^rv:.::MKr:. :a-:a::p:ea.:lk:;: 
;K? rr:rr w-K- f.^'i: :i ■r:.\:^M\>»A I v'-'.'.;rr".-/A':;r-M:'F: 

VAWYQ0KUJ^FCRKTCra.P3EASCLVPtfPHRFHItCSt^Sli-TPV3LAMGYNILAT 
Cro«VOAYAILANa;YAVRPTLVXKIVSASGEEyHLPTKEKTRlJ'SEErrREVVRAMRFT 
TLPOCSCFRASPKHHSSAGKTCrrTEKMIHCKYiaCRRH rASFICFTPVESSBCaff'PPLVKL 
VS lOOPEYGLRAIXTrKNYMGCRCAAP IFSRVADRTLLYU:iLPDKKUlNCD£EAAAI^^ 
YEEWNRSPKQOCTR 

CPn_0420 467120 466824 

CT271 hypothetical protein

KSFPf©IKSRFLRI^CIjCFCGSLrYFYrNKOMSLTKIJU*EIPCLSVRUlOt£OONISW 
L IDKI ERPDHLMEIAALPEYOYLEYPSEES ISU.SYELP 

CPn_0421 469007 467108 

yabC-PBP2B Family methyltransferase

EI LMSERAHI PVrt.VEECLUrf AORPPCTn^RDVTLCy^XniAYAFLEAYPSLT^^ 

OALAIAEWlLETFODRVSFSHASFEDlJtfJOPTPRLYCGVljaSLGVSSMOLITrLSRG^^ 

GEKEELDMRMEXTTOELSASDVUiSUCEEXLtSirFREYCEEPQWKSAAKAVVW 

S lODVKEAU/r^PKYRFHRKI HPLTLI F0ALRVY\TCEDROUSU.TSAISWtAPOGRl. 

VI ISFCSSEDRin^KWFFKEAEASCLCKVITKJCVrtQPTYOEVRiWPRSRSAKLRCF^^ 

CPn_0422 468233 468784 

CT273 hypothetical protein

GIAMVEIFNYSTSIYEOHASNimrVSOFllKEIWBSrSIRDW«HAOIU^^ 
La7n«KSHWACFSPP^^^FYK0RFSTPYlJU•SU:SPDCX}DEDIEKISSFLKVLTR^ 
RSOITPFI^YKDKEEEnJEBPEEmOOPRVOQCaCVLIJWLDLGVKSmWir^ 
FVQG 

CPn_0423 468788 469216 

CT274 hypothetical protein

CMIXNEWKAIUWCDDErXELRISGYSFLROGHYSKAILFFEALVILDPt^rYI^^ 
LYLO IGENS0A1AV^D0AIJW3DHLPTU:2^KTKAIJ'CU;RIEEAT^ 
ANDAEAU-MSYSKATKKNAALVR 

CPn_0424 469528 470961

dnaA-Replication Initiation Factor

SRaJErFSPSIl<CWVDCIWESFINKESGMLTCNECTTWEOFU;^^ 

IQVLEETOEKIRLEVPfarVOWLLCMYKRDICSFVPLDVTO 

ASOKESNBCISEVFOTTOFELKIJa.SYRFTWFIBCPSNOFVKSAAVCr^ 

FIHOGVCaXIKTHUilAVCHYVREHHKNLRIHCITrEAFtNDLVYH^ 

UJlXLVDDIOFWNROKFEEEFOTITETLINLSKOIVITSIKPPSQUa^ERIIA^ 

LVAKVGIPDICTWAILOHKAEOICLLIPNEMAFYIADHIYCNVROIXGAINKLTAYCRL 

FCnCSLTETTVRETUCaJTlSPTKOKISVETILKSVATVFQVmJDL^ 

XAHYlJUCrLITDSLVAIGAAFGKrHSTVLYACKTIEHKLQNDETLK^^ 

CPn_0425 470965 471564 

CT276 hypothetical protein

FRCCPMFRRTCKGPFnJVOTLYEEETSSPSSVSPYSRSERPETPPSLFDNPKASEARPLN 
HNLTEKSLPaWSSrPRTESa.PI^PETTLGEGVTFKGELAF£RLLRIDCTFEGILVSK 
GKII ICPKCWKADIOLOEAI lEGWESNITVSQCVELROGAI IKGDIQANTLCVDECVR 
ILGYLAIACITDHSERERDL 

CPn_0426 472111 471536

CT277 similarity 

MVLFSLLFPKLCYGC0APGAYFCSNCU3CII.VEDRECRCLHCFRYLCSS 
SOLOAFSLYLPSOTALSVYARACEGKRPALOFFSKSIAFELASLDETPSCIAYITSTISR 
KIVVEVAKLEKLLRIPI>rPWLPKKROiaaPKCECICFLSAYPLSOKWMOTIVGGSASPL 
VSISLFLSQNDQ 

CPn_0427 472153 473715 

nqr2-NADH (Ubiquinone) Dehydrogenase 

AVCYVFERVEASTFLSITMLKKFINSLWKLCOODKYORFTPIVDAIDTFCYEPIETPSKP 
PFIRDSVDVKRWMMLWIALFPATFVAIWNSGLOSIVYSSCNPVLMEOFLHISCFGSYLS 
WKEIHIVPILWECLKIFIPLLTISYVVarrCEVLFAV\niGHKIABOLt.VTGILYPLTL 
?mFifWMAAIX;iAFCrWSKELFa 

CVtKDSU4KKNSSTCKVLIDGFS0STCL0TU/STPPSVKRUiVDAIAAN^ 
vrHSOFSLWTETHPCWVLDNLTLTQLOTFVTAPVA£X3GLCU.PT0FDSAYAITDVTYCIC 
?l^?^'^''^"°^"^^^f''^^*^I^IVrGIASWRTTWiFGrCAFt.TCWLFKFIS 
'/LIVCQNCAWAPARFF I PAYROLFLCCLAFCLVFMATDPVSSPmKLCKWI YCFF IGFMT 
IVIRLINPAYPECVMLAILLCNVFAPLIDYFAVRKYRKRCV 

CPn_0428 473719 474681

nqr3-NADH (Ubiquinone) Oxidoreductase, Gamma

NMSKGS3KHTVR EhKnW r VSF ILCLSLFACVU^ I-rrVl^ P IQEOAATFDRi™ 

ACYROLLINFSNLTHEKKTCE 

::Pn_042'i 474t',6n 4753 W 

nqr4-NADH (Ubiquinone) Reductase 4
KENPF'm'=KK:;vKJYFFnrLwnw4CjiLiAiu:iL\-;Auv\rrTTv^ 
r:i-ry:xLRKFTPn>:vnHiTOLnt:;i.FV'tviDCFLKAFFFDi::i<Ti^:vFVGLrtm^ 

• :».^H.'.LARnVTr' [rAFUX^FASCU.'YiTWVLLVIC'/r RELFTFCTLMCFR I tPOFV^AJET 
nqr5-NADH (Ubiquinone) Reductase 5

KMWI /;AY'iV/( /IVI f : I LLUAAF ION I LUWU iMCrJYI^*: rrRV.Tr/VA ;U WL-VALVLTVT 



CPn_0431

No robust homolog
K EtfrarKYVPRSRONPDTLTFLKRYSJVLLHSENGLSYR tFAJCVXArLLTSLAVAFAVI 
LFSCBC JOURLCAI-Y 10 IAt-WCVt.:.Tr V.-YCIAJK I ATAJKKPPS ISR lEIV 



CPn_0432



47'i817 



47h5U 



OLf LTCiAPU/UJ I W 1 AAJ.. ; TLJMUVCAv.'Vi'h Y K IJNALtiK rKVAHti 

CPn_0433 4773:7 476929 

gcsH-Glycine Cleavage System H Protein

RTFRILYCTLYRTCSRKVMWYSDYHWILPVTiERWRLSLTEKMOKNLCyklL^ 

SLCK£CEVLVILESSKSAIEVLSPVSCEV:0IKLCLVDNP0KINEAP£CEI5«AVVI^ 

DWDPSNLSLWDEE 

CPn_0434 479471 477276

CT283 hypothetical protein

RIWRIYOODLFCRLCRDPAWFFSlXSFTLRrYCLGRGWTLLSFrfKHOKKriCrVIAW 

CVSCIC^^X^CRFSRKGSAESTSRRTVFTrASGKRYVEKDFMAMKKFrAHEAYPriGNPRA 

WNFINECLiTOYFLrrRVCEKLFIJC\nfHPGEK:FSKEKAY0PYRRFnAPFISSEESftIK^ 

APOUXILKVF0OIENPISKEGFLARAKLFLEERRFPHY\^JlO^II^YW^0MFAIJ'PDE^ 

SRGKDLRlJGYOTIOOWrCDAYLSAAVELLIRriDEOKKVLPRPSKOEARDOFYOKAICK^ 

YTKISKMCEFSU;FEEFVNSYFOFI£ISESEFFNMYROIU.CK3ULLU<QGCVSFDTO 

TTFFVOGKDS lOVEFFRLPKEYSFKTKQELKAFEVYLKLVSLPKSDSLDVPNEILPIATI 

KAXEPRLVCRRFSIDYKRVAWDU^ATVPKVEVLHWIWNSEHFXJEILTO 

DFOHUCPAIJlDKISUTRKEIIJlARPERILOSIOOVPKOSOCVU^ACKNSALrciS^ 

0IJUCVU£NEVU3LYS0nAETYYTirVNSSFEKEEVLPYREVIJ^ 

MERt£SALRTRYPGEEGASLWRRI>WVVElWRU;RHLECSFSWSLORSt^ 

POEFPRIFSMXVCI>YSSVFMSPNBCPCYYOCI^HU,YDRPASVDKmJUCSOLDEELLK 

YMERFIEOGWR 

CPn_0435 480908 479475

Phospholipase D superfamily [uncleavable leader peptide]

GVWtSRLRFRIJUa/JIFFrLLVPNSVSAKTIVASDKEKVCVLVYIWSVEAFTOIL^ 
ANmaCPCHIGGRTlJCPtVDHLEARMDLWELCSYiriQPTFTnACTXjit ^r^ ^ 

PWRFFYVrTGCPPSTS ILAPNVIOWIKI^ iin;KYciLOGTMr£Em:-r w ^ ut .vy t K V D 

NPRrJT;SCVWU>LAFRDODIMLRSTAFXajOUlEEyHKOFAMWDYVAHHMWFr^ 

ACPPLTMAEElVFPGrDKHnJLVLVDSSKlRrVUXJPHDKOPNPVTOCVUaJQQUlS 

SVKLAHMYFIPKDELLNALVDVSromCVHLSLITNCCHEI^PAITCPYAWGHR 

YGiatYPLWKKWFCEKLKPYERVSIYEFAIWErrOUfKKCMriDDEIFVICSYN^ 

DYESIWIESPEVAAKANKVFTJKDrGLSI PVSHC01FSWVFHSVHHTLGHLOl.Tnn»A 

CPn_0436 481633 480902

lplA-Lipoate Protein Ligase-Like Protein

FYVCVMKTOIVDSGKSSAASHMAKDROliESI^IXSElJUiLYEWEOTCSLTYta^^ 

FU^ADLGLDAAVRPTOarVFHXCDYAFSVUlSATHPSySSSVLEl^^ 

LEKVTRIOtaOAPEDENSSSIUJSQnTHAia'SICYDVU^D 

LFLSGSSSErrORFUCPEVLEEIIEQIOIHAFFPLClXAADEVLQEARQCTVKW 

CSCL 

CPn_0437 481810 484350

clpC-ClpC Protease 

AROEVERLIGYGPEIOVYGOPALTGRVKKSFESANEEAStXEMJYVCTEHLLLCIU^^ 
SVAL0VII^IIi1IDPR£VIUCEIUl£LETFNL0LPPSSS5SSSS5RSKPSSSKSPLCH5LGS 
DKNEIO^AIJCAYCYCfLTDfVRCSKZiSPVIGRSSEVERLXLILCRRRKmPVLICEAC^^ 
TAIVBCLAOKIILNEVPDALWCKRI,ITLDlALKIACTKYRGOFEERIKAW 
LIJriOEUfrrVCACAABGAIDASNIUCPALAIWEIOCIGATrrDEYRKHIDCnAAL^^ 
OKrVVKPPSVDCTIEIUlCIJaCICYEEHHNVriTEEJUJCAAATl^ 
LDEACARVRWmCOPTOLMKLEAEIDmOAKEOAICTOEYEKAACUUJEEKKU 
SMKOEWDWKEEHQVPVDEEJVVAOWSUnCIPSARLTEAESEKIXKI^^ 
OAVTSICRAIRRSRTCIKDPhmPTCSFLFLCPTGVGKSIXJTOIAIEKFOCniALICJVW 
SEYMEKFAATK«K3SPPGYVGHEEIXmTE(:VRlWPYCVVLFDEI£KAHPOIIC3L^ 
EOGRLTDSrGRKVDFRHAIIim^NtXyQLrRKSGEIGFCLKSHMDYKVIOEKIEHAMKK 
. HUCPEFINRI^ESVIFRPLEKESLSEIIHtXIWCLDSRUCNYCXIALNIPOSVrSrLVTO 
HSPEXSARPLRRVIEQYI^DPIJ^ELLUCESCRQEARKU^ATLVDJRVArEREEEEOEAAL 
PSPHLES 

CPn_0438 485455 484334

ycbF-PP-loop superfamily ATPase

NLTLPMP.FOVREIMOOTVI VAMSOCVDSSWAYLFKKFTNYKVICLFMKNWEEDSBOCLC 
SSTKDYEDVERVCLOLDI PYYT/SFAKEYRERVFARFLKEYSLCYTPNPDILCNREIKFD 
LWKKVJEUMDYtATCHYCRUrrEUJETOIXRCCDPOKDOS'^Ft^CTPKSALHNW 
CEMNKTr/RA I AAOAALPTAEKKDSTG ICFIGKRPFKEFLEKFLPNKTCNV t DWCTKEl V 
OOHCCAHYYTICORRGLDUXISEKPCYVVCKNI EENS lY rVRCEOHPOLYLRELTARELN 
WT*PPK3OCHCSAKVRYRSPDEACTrDYSSCDEVKVRFSOPVKAVTPO0TIAFY0Ctm:L 
CSCVIDVFMIPSEC 

CPn_0439 495523 486077

No robust homolog present in Genebank/EMBL as of 11/7/98
I ISSNHP.'/LFVSSTLNCVFPSStPEESADLFITNKEIVALCEKCNVFLTHSI PMHIAAIT 
I LVIVAtAG lAI ICtCCYSeS I LL lAVCIVLT I LTLLCLOALVCFI KFI ROLPQOLHTTV 
OF IREK XRPESSLOCVTNAORKTTODTLKLYEELCDLSOKEFKLOSTLYOKRFELSHKNE 
■ KTHQN 

CPn_044') 4(J»in»!l 4flt;740 

No robust homolog present in Genebank/EMBL as of 11/7/98

LAT t fw';:5iMAT:WAP>:rvPF^::;pL.':i (atevlmlwiay tTOPiip t paapwetfrsklstkh 
Ti^:FAL:uxTucTi.:Aiiv.v;rrr:Nwt i(.-<;rr:u;r ivrTLiLALLuvrrucNKonTTncL 
£ UF. tr/y.^ I r.s ic:» ifvqr YituHF.Tr r k: :vi il( 'Elttonqektp i lne i eakket; conlel 
K[TBc:or;KLAOKOPKRK:;;:oKr;F:<p::rKitr^:KNPvrr.Fn(: 

'rrnnv .-.yptuiutr it:.il t*t^«r<rift 

i^wMiu.:54FKU.nirAAKAi:iivt.TrP(iwvMiAf;r;t(iKiw>rfirMM'Kr'K:;.\cMj^^ 

v/ArKyLT':FYNYVLL::u'^YTLr;u<f ft*.*/: : t r u-/ iw^ui-m t rk ;YCi.Yyi:vL:;r:KYOAT 
i-jcij:ArfV/TNCTt:uiOFj<,\wfM/y'n/::YKATf/^t;r(JK:rnw::[r/^H:T::v! 
K LTR FR K K I J f iHL I - : ;pc r r rr/ :hf.i eat r/K , r< f i Kcr/'' w:; t D i s t aodh
NNNKTSi irrKT: :af ftv::;:avmhk 

■-pn_04»2 AHirt2 4R4S2R 

CT006 hypothetical protein

N I LFOTCMGFKt I ICKOC^OLYLMf: I FPER I LARKLKNCAKSYPRTALTI r/LVSSVLCA*- 
KV t L ; pf-AGr^ AALTL PLR ALFI lA I KTK3C0H LA3Y/\MAWLLH I L7 1 AV I IGLVFS LVF I 
PPPVyFr.tt>':LLM,^/TT.T/TLF''//HKNt,FPPYEPPP::RPKrPPPFAOr^PLlSESYrD 

CT005 hypothetical protein

VDb'MSOPPINPLC0P0VPAAA5PSajPSWKRLKTSSTCt.rKRFITIPDICYPKMRirvm 

C: lAJJWVIAIt^ aOTASGNSLKLVAlJ^AZ-ALCyU^LilSOrLOSPKA^ 

IWP 1 IVIAIAACL lAGAFVASSGTMLVFANPKFVMCL ITVCLYFKSLNKLTLEYrRREH 

LLRMEKKroCTAEPILVTPSADDAKXIAVEKKKDt^ASARMEEKEASORODARHRRIGRE 

ACXSSFFYSSRNPEHRRSFCSI^RFKTKPSOAASTRPASISPPFKDDFOPYHFKDLRSSSF 

GSGASSArrPIMPASSRSPNFSTCTVLHPEPVYPKCCKEPSIPRVSSSSRRSPRORQDKQ 

QOOONODEEOKOOSKKKSGKSNOSUCTPPPDGKSTANI^PSNPFSOCyDaiEKRKHW^ 



iIR^YrJACUVlKFCF 



CPn_0444 



49026« 494S07 



pmp_6-Polymorphic Outer Membrane Protein

KA^ibRHMKYSLPV^LTS3ALVF3U^PLMAANTDI^SSDNYE^GSSCSAA^^AKE:^SDA^^ 
ffmTLTSDVSITNVSAITPADKSCrrNTGGALSFVCADHSLVUTTIALTHDGA^IM 
■ TAl^FSCFSSLLI DSAPATCTSOTKGAICVTNTECCTATrn)NASVTWKrrrSE3CDG^ 
SAYS IDLWCrn'AAU.DCNTSTWXWALCSTAtnTVOCaJSCTmFSSNTATOKOT 
EKDSTLDA^TCW^FKS^^•AKTCX3AWSSDD^JlALTC^^W^0El^^ 
CGAXCCYIJVTATDKTGtJ^I SONOE>tSrrSNTTrAhCGAIYATKCTLtXOTI^TnX}NTAT 
ACCCGAIYTHT^5FSLKCSTG^vr^ST^^^AXTGGALYSKGNSSL^tam^i^S 
SNSSANOECCGGAI LAP I DSCSVSOKTCLS I ANWEVSLTSNAATVSGGAIYATKCTLTG 
tCSLTFDGMTAGTSGCAIYTCTEDFTLTCSTCrnn'FSTWrAKroGALYSKCN^ 
LLFSGNKATCPSNSSANOEXXGGAXLSFLESASVSTKKCLWimpi^ 
ArYATKCAUfGNTTLTrDCWTArrAGGAIY'rL'l'tiJt ILTCSTCTVTFSTTJTAJCrAGALHT 
KGin-SFTKNKALWSCaJSATATArrrnWEGCCSGAI LCNISESDIATICSLTLTEJ^^ 
INOTAKRSGCGIYAPKCVI SOSES INFDCNTAETSGCAIYSKNLSITANSPVSFTNNSCG 
KCGAIYIADSGEI-SLEAIDGDITFSGNRATBCrrSTPNSrKlJSAGAKITKrAAAPGHTrVF 
YDPrTMEAPASGGTIEELVINPVVKAIVPPPQPKNGPIASVPVVPVAPAWPr roriV^ 
GKU»SOnAS I PANTTTI LNQKINU^X^JVA^BC^TLOVYSFTOOPOSrVF*^^ 

TTnJNTDGS idlknlsvnu>aux:krmitiawstsgcucisgoijcfwwecs 
kanlnlpfldt^stsctvnixdf>ipipssmaapdygycx;sv^lvpicvgaogk^ 
algytpkpelratlvpnslwnayvnihsiooeiatamsdapshpgn^iggignathqdko 
kenacfrlisrgyivggskttpoeytfavafsolpckskdyvvsdiksovyacslcaoss 

YVrPU^SSLRRHVl^m,PELPGETPLVU^GOVSYGRNHHN^f^TKLA^af^OGKSt»ro 

favevcgsl?vdlrrfryltsyspyvkl0ws\w:kgf0evaadprifoashlwvsipm^ 
ltfkhesakppsalu*tlgyavdayrdhphcltsltngtswstfatnlsroaffaeasch 
lkllhgldcfasgscelrsssrsynancgtrysf 

CPn_0445 494739 497579

pmp_7-Polymorphic Outer Membrane Protein

FNFLVSKKCW3MKSSVSWLFFSSI PLFSS LS I VAAEVTLDSSNNSYDGSNGTTFTVFSTT 

DAAACTTYSU^DVSFQMAGAIiJIPtJ^SGCFLEACCDLTFOCNOHAIJCrAFIf^ 

VASTSAADKNLLFNDrSRLSlISCPSIXLSPTCQCAUCS\rail^LTaiSOriFTONFSSD 

NOGVnnTNFLLSGTSOFASFSRWAFTCKOGGVVYATGTITIENSPGIVSFSQNIA^ 

GGALYSTIircSrrONrOVIFIXINSAWEAAOAOGCArCLrriUmTLTGNKNt^FTN^ 

LTYGCAISGLKVSISACCPTLFQSNISG5SAGCXXX;GAINIASAGE IALSA TSGDrmiN 

NOVTtWSTSTRNAINI ItTTAKVTSIRAATGOS lYFYOPITNPGTAASTirrt^^ 

EIEYOGAIVrSCEKr^PTEKAIAANVrSTIROPAVLARGDLVTJlIX;VTVTrra 

RILMDGGTTI^AKEANI^LTCtJ^VNLSSLrXTnJKAAUCrEAADm 

rrElOCIIJCSASTYPIXELTTAGANGTITU^AI^LTLOEPETHYCYQ^^ 

KIGSINWTRTGYIPSPERKSNLPLNSLMGWFIDIRSIWLIETKSSGEPFEREL WL^I A 

MFFYRDSMPTRHGFRHISOGYALGITATTPAEDOLTFAFCOLFARDRNHrrGKNHGCTYG 

ASLYFHHTEG[J■DIANFLM;KATRAPWVl^EISOIIPt^FDAKFSYU^^D^^ttQCTYYTDN 

31 IKCSWRhfDAFCADLCASLPFVISVPYLLKEVEPFVKVOYIYAHOOOFYERHAECRAFN 

KSELINVEIPIGVTFERDSKSEKGTYDLTLi^YILIUYRRNPKCOTSLIASDANWMAYGTN 

LAROGFSVRAANHFQVNPHMEIFGOFAFEVRSSSRNYNTNLGSKFCF 

CPn_0446 497602 500415

pmp_8-Polymorphic Outer Membrane Protein

LI EPKHLSHKI PUiKLL r SSTLVT PI LLS I ATYCADASLSPTDSFDGACG3TFTPKSTAD 
A^Xr^NYVLSGNVYI^^3ACKGTALTCCCFTETTGDLTFTCKCYSFSFNTVDAGS^^ 
TTADKALTFTGFSNLS F I AAPCTTVASGKSTLSSACALMLTDNGT t LFSOMVSNEANNNG 
GAITTKTLSISCNTSS ITFTSNSAKKLCGA lYSSAAAS I SChrroOLVrMNNKGETGOGAL 
CFEASSS ITONSSLFFSGNTATDAACKGGAI YC EKTGETPTLTISGNKSLTFAENSSVTO 
GGAICAHCLDLSAAGPTLFSNNRCCNTAAGKCGAIAIAOSGSLSLSAN0GOITFLCNTI.T 
STSAPTSTRNA I YLCSSAK ITNLRAAQGOS lYFYOP TASNTTGASOVL-T INQPOSNSPLD 
YSGTIVFSGEKLSADEAKAADNFTSILKOPLALASGTLALKGNVELDVNGFTCTEGSTLL 
M0PCrrKLKADTEAISI.TKLVV0L5ALEGNKSVSrETAGANKTir-TSPLVF0DSSGNrYE 
SHT INOAFTOPLWFTAATAASD I YI DALLTS PVOTPEPHYCYOGHWEATWAOTSTAKSC 
TywmCYNPNPERRA5:WPO5LWASFTDrRTL0OIWr30ANSIYQ0RGLWASGTANFF 
HK0KSCTN0AFRHK3YC\'IVGGSAEDFSENIFSVAFC0t.FGKDKDLFIVENTSHNYLASL 
YLOHPAFU-iGLPMPS FCS ITDMLKD I PLI L^OLSYS'n'KIIDMDrTRYTS YPEAOGSWTNN 
SGALELGr::;LALYLPKEAPFFCCVFPFLKFOAVYSROONFKESGAEARAFODGDLV>*CSI 
PVG IRLEKIilEDEKNNFEISLAV IGDVYRKNPRSRTSLMV.'IGASWTSLCKNLAROAFLAS 
ACSHLTLnPHVEL.XEAAYELRCSAHrYNVDCGLRYSF 

CPn_0447 500541 503351

pmp_9-Polymorphic Outer Membrane Protein

ryK PP [ ALYMK53U IWFL r .':.^:3L.\LPL.^l^FS AFAAWE I riLCrTNSFSGPCTYTPPAQT 
TMACCT fYNLTnW:-: tTNAi;3PT.ALTASCFKETTCtfLCF0GHGY0FUj0N I DACANCTFT 

^f^AAIlKU«*;^:;^>F:r^'L::LtcTmAr^CTGAtKCTCAC3Ioc^^^5CYP^^N 
• r:::.:;:^iPffLTrAKNKATOKa^At.YrrtTx; it trn rrLN^ACFSENTAANficcAiYTEAS 
: :i • r r:::i ika i ; :f i m: vrAT:%vr*:v:A c Y<:c:rr.': apk rvLTLncNr^ELNF ic* jta rrnocA i 
7TMiL7i-::;a :rrf .rKhWf.A rDTTAAru'^ r A t Ao:;':3t-:i^Auy;DtTFECtfrwKCAS 
;:.'.vnTWi::iNi';wrNAK tvoi.RA::o^fmYrYDi- [tt,'; :TAA(.;;oAiiguir;PDLACNPA 
:r r/Fr^ :EKl^:^v^FA:\lv\DNLK:r^ I'joPLTr^v /;oi^XKr:'.vrt.vAK::F30::pnrrrLL 
Ml jat rrn .KrAtx ; it i nni.vi >iVP:;LKi rrKKATi .ka roAJC/r/TtJ**:::u;LVDP.'i;r;NVYED 

7:.-WMni V/r:XXrt.TAtHi|-,\N IK ITDUWDPLKKMr IHWr/'yiNWAUWOECrrATKCKAA 

ri .vtfrri* iynpn ( -kp nt m .vanti .wt ::;rvnvn : : rooi-vATr/nn: vktrg rwcEn t 3NFF 

HKIjrrTK It IK. :R<1 1 1 ::Ai :YVV( :Arrri ^V;DNLlT*\AK^*OLf•';KDI^UMF 1 NKNHAGAYAA5L 

1 II jui iiATt :i .1 .i*YM t :: :i .;:EorvLi- PAo r -Y i Y:;Km?4KTY\-(xv\rKCE;::>/VNa;c 
Ai .Kl /j;: :i a irrAt^ :iui :i j-mayfit [ kwa: :y mtM- iFKKPrriTL.vi;: rporvjOLlMVCVp 
I* : rrKKi'K::i.MKi'A:;YiATv I vvAt>wr<KMi*i x -t-rAi.{.( rirfr:.-WKrnrrNL.':aoAt; tCHA 



CPn 044 (J • u-" WR76tj IJ ^<>:)<i-'»»* ? 

•yxTc Bs 2 Wyprta<neciCdt ''Pirotjein a— 

FtOPSRREIHEWKC::.U;33LRMEM4SPF0OPE0CHF0VVC3FLRPE3LTRARSOFEBCR 
rVYEOMRVVTDAA: PJJLIKKOTEACLI FFTTCEFRRYSWDFDFMWCFHCVTORRDSNDPE 
tCVYUCDK I SVSIWPF I EHFEnncr FEKGNAKAKCTt PSPSOFFHEM r FAPNLKNTRKFY 
PTNOEL I ODIVrrrPOVICCLYAAGCRNLOLCCCAWCRLLO I RAPSV^T^SHDRLOEIL 



CPn_0449 507231 505330

pmp_10-PMP_10 (Frame-Shift with 0451)

EAYTGFRCCOC ISF3^WIV0Cr^ACNCCyVIS I LJV^SECSLSAEACDrrrNG^ 

TTKRNSIOIGSTAKrrNLRAISCHSrFFYDPlTAOTAAOSTiniJlU«^^ 

rVFSGEKI^EDEAT/AIXiLTSTUCOPNnXTAGtn-VIJCRGVTLtyriC 

TTUCASTEEVTLTSLS I PVDSICECKKWI AASAASKKVALSCPIUJJDrCCN^ 

CKTODFSFVOLSAl^ATTnTV^AVPWATPTHVWQCTrWC^^ 

WTNTUYLPNPEROCPLVPNSLbCSFSOXOAIOCVXERSALTirSORCFWAACVANFLDKO 

KXCEiaurfRHKSCG-MrOGAAOTCSEI«.ISFAFCOLPGSDKDFLWW«Tt^ 

HITECSGFICa^KLPGSWSHKPL\nL£COLAYSHVSrroLKTKYTAYPEV«CSVa«AFN 

MMtXyVSSHSYPEYLHCFtnTAPYIKt^TYIRCOSFSEKGTECRSFODSNlJWI^ 

KFEKFStX:NOFSYCLTLSYVPDLIRNDPKCTTALVISGASWETYANNlJ«OALQVW«^ 

YAFSPMFEVLGOr/FEVRCSSRIYNVDUSCKFOF 

CPn_0450 508121 507190

pmp_10-Polymorphic Outer Membrane Protein

SGFWCSOFSWLVt^STLWrFTSCSTVFAATAEWIGPSDSFDCSTTmrrYTPKNrTTC 

TLTGDITWNLCDSAALTKGCFSOTTESLSFAOCGYSl^FLNIKSSABCAAt^TrO^ 

SLTGFSSLTFlJUtfSSVITTPSGKCyiVKCOCDLTFDNNCTILFKODYCEEMXAISTK^ 

SIJCNSTGSISFEC>OCSSATGICKOGAZCATGTVDITTINTAFTLFSNNZA£AAC^ 

CTITGOTSLVFSEJiSVTATACNGGALSGnADVTISCNOSVTFSCSNOAVA 

LA5GOC3GVSPFLTI 

CPn_0451 508158 511058

pmp_10-PMP_10 (Frame-Shift with 0451)

KTORWIKIUJSCFVrFm.IYLFCFYIDANSSLKNICSITMa'SIFWLVSSVLAFSC^ 

SIJ^NEELl^PDDSFrOJIDSGTFTPICTSATTYSLTGDVrFYEPCKCTPl.S0SCJ KCfl'i W 

LTriJGNGHSLTrCFIDACnTOGAAASTTAWKm.TFSCFSLLSFPSSP S 1 ' 1 T l OO GTI^S 

ACGVNLENIRKLVVACNFSTArxX;AIKGASFLLTCT5CDAIJ'SNNS 

RIAKmt7f(^RFLSNZASTS0GAID0SCT5IL5N^^CFLYFEratAAIC^^ 

ELI ISNNKTLIFASNVA^'50GAIHAKKIJU^SGCFTEFLR^I^nr5SATPKOGAI5IDASC 

ELSLSAETCWITFVHNT Li ' VlUi> I Ul PKimAINICSNGKFTELRAAKNHTrrrYDPrrSE 

GTSSOVUCXfOCSACALNFYQGTILFSGCn.TAS£UCVAaaJCS5FTO 

KGVTtXSTSFSOEACSU^KDSGTTLSTTACSXTXTNLGINVSSU^LIQ 

KWSCKLNLXOIEOlIYESHMFSHDOLFSLLKITNmADVDTWDXSSLIPWAir^ 

YCFQGOWNVNWTOTATNTKEATATVmCTGFVPSPERKSALVCNTtA^^ 

EXGATGMEHKOGFWSSmWlJUCKroamKCFRHTSOCYVIOCSAHrPKM 

LFAWJKICFIAHNNSRTYOGTLFFKHSKrWPONYLRLGRAKrSESAIEKrPREIPL^ 

VQVSFSHSDNRMErTHYTSLPESBGSWSNECLAGGXGlJJLPFVLSNPHPLncmi^^ 

^^VY^WNSFFESSSlERC^SIGRU,Nl^XP^raAKFVQCOICDSVT«)l.^ 

PQSTATLVMSPOSWKXRGGNLSROAFLUUSSNNYVYNSNCEIJCHyAKEUtSSSRNy^ 

VCTKLRF 

CPn_0452 511304 512860

pmp_12-Polymorphic Outer Membrane Protein (truncated)

FNEirKriLRNFLTCSALriJUJ>AAAO\AnaHESDCYNCAI»mKStXPKITCVP 

FXjmVRISNVKHOOEDAGVFXfniSGNU'FMCaraCNFTFHNI^^ 

LSNFSYtJkFTSAPLLPOGOGAIYSLGSVMlENSEEVTFCGNYSSWSGAAXYTPmXKI» 

SRPSVNLSC^mYLV^^UWS0CYOGAISTHNLTLTTRCPSCFEN^^HAyHOVNS^ 

APOCWISISVKSCDLXntCOTASODCm'IHNSXHLOSGAOFKNLRAVSESCWYOPISH 

SESHKITDLVINAFSCKETmrriSFSCLCLDDHEVCAENLTSTILOOVTIAEXm^D 

CVTLOLHSFKOEASSTLTOSPCrnXCSGDARVONIJiXLXEDTraJFVPVRII^ 

SLEKLKVAFEAYWSVYDFPOnCEAFT I PLLEIXGPSFDSLLLCCrrLER lUVi'l LNDAVR 

GFWSLSWEEYPPSLDKDRRXTPTKKTVFLTWNPEXTSTP 

CPn_0453 513156 516152

pmp_13-Polymorphic Outer Membrane Protein

^K:^£LYLFFYSE^L:CRI rWFHLYVOMKTS IRKFLISTTLAPCFASTAFTVEVIMPSENF 

DGSSGK IFPYTTLSOPRGTLCI FSGDLYI ANLDNAISRTSSSCFSNRACALOXLCXGGVT 

S FLNI RSSADCAAISSVITCNPELCPLSFSCFSOMI FE»iCESLTSDTSASNVI PHASAIY 

ATTPMLFrhnmSILFOYNRSAaFGAAIRGTSITIENTKKSLLFNGMGSISNCGALTCSAA 

INLINNSAPVIFST^UTCXYOGAIYLTOGSMLTSGNLSGVLFV^ff^SSRSaGAIYANGNVT 

FShWSDLTFOWn'AaPONSLPAPTPPPTPPAVTPLLCYGGAIPCTPPATPPPTCVSLTIS 

CENSVTFLENIASECCCAL^'GKKISIDSNKSTirLCNTACKCGAIAIPESCELSLSANOG 

Dl LFNKNLS tTSC7?TRNS IHFGKDAKFATLCATOGYTLYFYDP ITSDDLSAASAAATW 

WP)CA5AIX;AYSCT:vrSGEn.TATEAATPAfUTSTLWKtXt£CCTLALRNCATLN^ 

FTODEKSWIMDAJrrLArm^ANWTDCAITI/lKLVINLDSLarrK^ 

IGGTLCL\TO«0CC:DWGMFNKDLCX)WILELKATSm^nTrDFSLGTNCYQOSPYC^ 

GTWEFT IiyiTrHT\TCNWKKTGYLPHPERlAPLI PNSLWANVIDUIAVSOASAAIXIEW 

GKOt^ITCITNFFHANHTGPARSYRHMGGGYLINTYTRrTPDAALSLCFCOLFTKSKDYL 

VCHGHS^A^rFAT\T^^IITK5LFC3SRFFSGGTSRVTYSRSNEKyla'S^TKLPKGRCSWSN 

NCWLCELBCNLPITIoSRILNLKOI I PFVKAEVAYATHGGIOEtfrPBCRIFCHGHLLWVA 

VPVOVRFGKNSHNKFOFYTI IVAYAPOVYRHNPDCOrrrLPINCATWTS ICNNLTRSTLLV 

OASSHTSVNrr/LEZrGHCCCDlRRTSRQrTLOICSKLftF 

CPn_0454 516179 51911'-

pmp_14-Polymorphic Outer Membrane Protein

CM r'L3Fic::r:r;FCLi.v:u:3AiCAFAETR lgq ir/ppmiocEEi tXT.-rpFvc: wflcasf 
ntir.r wr^i;r.tiLr.LLr.i:i n^'LTFTCCOArTR'rrrMLLSAAETLTFKNFi:.': iNFTc^Jorn^t- 

< SCL t VJKD I VFO.-J IKI^L I FTTNRVAY3PArr/rT3ATPA ITTVTTnA:':ALOI'PD: M.TVDI I 

rjx: r KFPCNi ANF^:.-A I r; J.*? FTAWKP iNNTATM3F3miFTSS0cr:v t va:::::txFErws 
(v: [ [ftan::o/n:-: •;s:;*nri':TrrrAuvr/r,Ai^.iPV7rfEtM^ 

A I YAE^.-NI ;rK\V-\:,LLD::>rrAAPNf> :A f.Ar/LNIOCftCP r EF.':RNRAhUC( I F t';p 

: viwAKtyr.Ti.T : : jv;;p?.:pi Af-v^iMim-Kr-^: i rna tr/EAa;E i \/::L:XAut x ;::ulvfv 
Of mi.':t.m*::i'.sNKDiT tNAi*yv:>;;:wFTr:KOLS!n'ELU.PAfrrTTiLiir^ 

IX tTUIAWrryf ^:rA ::\VI TI r,:U y ;rLf;LATFTrjVPAAVOFT TKLAFDPFrtKLKnD 

i-v:;A:r/MA* rrKmT:.Ti ;AiAtxiHii iVTiM.Yfir/nuj::i''/AC piavfkv v\ tvpkt» \v\-tr,\i 
r A Tpr;! iyi :yf/;KWATwr;n rt.i, t papi/ y ;Ff ^rjp.iPSAr itlyavwn: uti'lvr: .t\' i i j^'E 
uY^Ei7:»<::LWL:r- u*K>AK::iJi wiiVLiauii-cuT [TAKJUjGAyvi3rrr(<L' :nvi:^^^^^ 
vrv ;VOAAU :MI rrrDHTTLC:-;p;OLVCKTNAM F . J ^ EOMYLLSFFCOF P IVTOKSEA

W0UZ:F r.'?AKK0OCW0GKFTCT^DU}R3F3RCKGYNVSLP rCCoSOWFTPrKKAPSTLTt 
K[ J^YKPn I YRVNPHN t 'nV/.-^NOESTS laCANLPRHGLFVO EHDV/OLTECrTOAFLNYTF 

:x:KNCFTNHRv:rr:LK.TrF 

'I*n„0155 S20J*;3 S19459 

No robust homolog present in Genebank/EMBL as of 11/7/98

• * -V.-, ; ..:...:;.-.:,r:-:;'. r^i irr rRr:r.;fA """PEKEAacav 

AJO;;KWAFDDh:HLPWV3;;HXAVA£E:IREK0E0T^^X^SLT£EQLi;ALLC^^^VSTEKNl^ 
ALDAVI KOSVWR FRNPOLFAYEREALEAS VTDALVSYVSNLDKI PYTSSOG IVI ECSSrV 
RTSOEHTLIVNCAAFDKLASOIEFLCPSDVLPISCKDPLISDDEDEELNPKVSSAADSKD 
KT 

CPn_0456 521568 520327

No robust homolog present in Genebank/EMBL as of 11/7/98

I PCrrESKRKFUmiCUiCWFSVVRHHrVOAnJFSRPLYSR ITHFALCVIKAI PIVGHLV 

MCVr»«.ISHCFERCVSHPCFPSDrAPrUCVEKIAGRDHISRiaroUCSLRKTIEVEDLDK 

VHCOYOENPYADMASSEVLKLOKCVHVSELGKAFSRVRNRITRSYSYAPTPOLDSIAI^ 

IOLVSPEEOENLVRIJKNEVrOLVPKSKTTLYLI.IDrNKEWVGDISSOKEKOLRSLGLHSE 

'/CCLSVLEPOCAECEDTKHFTJLhnAXrYCKDSYLRBCKII^AlJCTrSLCrr^ 

SRYRSRI^LPI^^rEKDKTELYKEISRTHHOU{TLGMGLCAODSCmJaORU^ 

HCHSYUUDLTHEELKILLFSArVDAKNISKKELREVSLWrANDTSVECGCAFVF 

CPn_0457 523886 522120 

No robust homolog present in Genebank/EMBL as of 11/7/98
VTLPSRVMASCl^AWFSIVREHFYRAFDFSLPFCAR ITEFVLGVIKCI PVVGHI IVGIEW 
LVSRYLESFVTKPTrVSDVVSLUCTElCVAGRDHIAR\A;rriJCRORVAVAPEDEDKVHCKI 
PVHPrGGIQPVEVLTLYPEVODATUSLAFSKI RNRVROAYLOAPRPKLOKIY I IGNEMNP 
rr/DOrUftJUU-CNCTORLYPDATISLYLTASCKRNAMDKXKRKIiSIXrEL^ 
^K^DVVK0ATCIX:WKVnrHCENO0GTL^raIOEEI£KSGEETFW 

sslemkctkekau:yselekeolysrlvyvgerssvi-slcfcdsrsgilmdpkrv^ 

echychsyu^lenpgujktitjufu^pkelsstilopisl^ilnsktylrohr^^ 

msrsdrnvwwcdswwctdwkeepsfqhfimelecrgyshfnifafrsnsmcvra 

NESSQEKAFTMI FCEDSVSWDIRCUflASEGMUXSKECYAVUVnrrSOCANFJlMEEVLTL 
ERESNLWNRKHCLWKREVRKQKOEAALDQDESEIYVCNOLTAQQNFACS 

CPn_0458 526344 524236

No robust homolog present in Genebank/EMBL as of 11/7/98

YFLCCYLKLFVS^fFI^FVVMPI PYISSWI STVROHFVKArorSRPFCSRVTNFALGVIKA 

IPIVCHrVMGMEWLVSSCVACIITRSSPrSUVA^IVKTEKALGRDHISRV^ 

ITPENODKVHGKFPVCPFGRLXSEETUCLKPGEItBCTriJm^rSPIRTRVTR^ 

IRTISIVGSKlJCTPQDFSOFVSrJWJETORUiPEALVCLYLTCLNRESQKCmTAEXTO 

U«SCnJ3SRIQCKDSKEDOAGSPEJn»Er*riCYYSREQOHNriXX7YIOCXLCKSADPtIWI 

HVTEOTKDnnrPPNrrSYSHTRQSTDPTSPPRU»ESBGDKDSLYGQLSRSYHHEVHLCLC 

LKPEDACUJCPDRIYAPLSQCHYCHSYLADIEl^LRTLVLSPFl^PGNLSSEDUlPVA 

FNIARLPIXUDSLFTRLVAGOQECRNIVTLAHGTPRPEDLDPDSMNILTRRLOMSCYSYL 

MIFSYKSRKMIVKEROFFGDRSBGKSFTLIIJ-EDPISAADFRCLOLJU^ECKVAKDLPSVA 

DICASGCSCIOFSQtOSPOAIEYRQWEARVEDEACEEAREPVIYSQOOLSSMLTTQWFV 

FSUlAVVKOAIWRfTlSKCLLTMERKALGEEFLTAIFSYLGSOERNENWianT^ 

SFEELDRMVQVLPAEVPADSGNDPTRPVPNPDSNPDSSONBGS 

CPn_0459 527062 526619 

No robust homolog present in Genebank/EMBL as of 11/7/98
STKlOMHPGLRNWRTSTNKLREECSVSFREYFRAYMCDKrVAOKNFLFTLDAVIKOAGWR 
SOEKUnTYVESOArXJRErKVStXEYIOSMVCILCSORTKKSFKTSVDFTPLEOALOERC 
SSDDDEOATATSTATGATASPTDMHEDE 

CPn_0460 527840 526992

No robust homolog present in Genebank/EMBL as of 11/7/98

VIOHIXNFALEETPSISVOYOEOEKI^PCDHSPEIGKKKRWNKLESFSTYCSLFMSVKDH 

YKOILGIONSl^GWlXDPYRVCAPl^SPYSCPSYIiDWNKEUUlSlXSTFLOPia^LTS 

TFRSVS INFCNSSFG0RWSEFX5RVLHDEKEJCHVAVVCNDAKLLEECLSPEAtSLLEEDL 

R£OTSYLNILSVSPECVSKVOEROrLJUU3LOGRSFTW4ITDLPLCSEDIRSLOLASDRI 

LVSSSLDAADACASGCKVLVYEUPNASWAOELmrYKQVERRR 

CPn_0461 528647 527944 

No robust homolog present in Genebank/EMBL as of 11/7/98

rS IVACPS ISSWFTVVROH FVNAFDFTHPVCSRITNFALG I IKA I PVLCH IVMCIEWLIS 

W I PRHTVRHGM FT5DVS SA I KVEOTRGHNCLAPLEAYLSS LR VP r SOEDLGKVHGRTPED 

PP/DrTPTEIVQLLPDEELSTVDEALOGVRSRLTYAYRSVEKPMIODLALVGFGLRDSAO 

LI^^^VRLA^K;V0NHYPHTKVKLYtAKNIJ^3VWIXEISK 

ACLPSVPEVATVDFMITCYCKDOEVQDP 

CPn_0462 531124 529037 

No robust homolog present in Genebank/EMBL as of 11/7/98
LI FYLFLNLY lACVRFHFOCWFDPMACY I S IWI STVKOH F r RAFDFTRPLGSR ITNFALC 
V I KAI P I UJCWrCVSWLVSTCSARRFGKPAFTSDVAS I VK I EKTRCYNPLAWVEOYLRO 
LRVRLPBCDLCK I HCKVS RDYVCDRTPOENLNMVPHOYLGELCRAFYC I RNRVTKAYORV 
TPLEVPCLTLVCFDI LDPEOOVWFVRLANG lOTQYPQTO IKLYLIS lOK IWN0CC3CTI SO 
EKEOOLRSLCLDAK IKCVSAPALLLOKYLOSENLPSCDLL INYYGKQOSVRDVDS IKSLL 
NLS3EH r PAISVTYRPDDPFYSYYFFPGSOGCTAPDORI PWSEOEHUTmTLSNPRCOR 
YAVHLCMEDFA3C/FLDPLRVSAPLSCEYSCPSYLLDLKSEELRCFLLSAFIDPNNSC0C 
NPRPMSINFCNSPLGORWSEFLSRVLHDETEKHVAWCNNPOLIKKSFPSHSLSLLENEL 
EECCrrSYLN [VSVSOERTCVKERR I LSSOPSCRSFTVr LTDLPECSSDIRNLOLASDRI L 
7r;CAL0AA0ACAS ECK E LEYEDPEQEWAOOYASFYRN IDRAGDLOROC I PCEPLCVSAST 
RVyLEKOiyFNLNAVICX:AMWKFKKRDLFAVESOALCDDMRRALECYlCSSLLVECTtOP 
OVAf:NVNVr;FATLDEAVC.\ACDSAOOAPSEENNTDD 

.•ti._fj>l',3 •;.124ao 511 1? 1 

No robust homolog present in Genebank/EMBL as of 11/7/98

: . :: :r*YEKTr^n,u:rrPNCRTFB VN t :rrvc t p i OETcriAFvosMMKCx :vcoDAr.ELYTFLSR 

■:rir3IY0li:iwr\':i.I.;EEU;FLFDEKMLCArL£:EDHY':[CYLVDLVtX)IILKDLrL3HFLDPQ 
ti I : W;b:LLKV:: I ttV* :I'::F: -PLOOKDFLJWLROETriKNWWFWTVLGLPATOVCKLVEE 
I JJ::KDY: :YI.N r F; TA K;PS:'.FOI.LFRKEr .ECTCGRYF-rv rCALYli ;[rrCMR:;L0LA3ER IM 
Vr:kKFUI .VlW/.Wr'i :Kt .LK l OHTNWR PCTFr;RHADFADAVOV:;Ai>'NSRErKLIT0AN0C 
I ;i:t J^r:;mVfi:!M .AFCDRVrmilF t PMLDAA I KOAWTMKMr:;L I DKECEALD 
I .rrv :i .l-:: t V: :VI .i:YVTN::iiKKT::KOf*F [OKE 1 1 A£y:r;PLKEALFtX;:iDEDVPST.':EDPS 

i/iJif-r.rji.Ku; 



■7Pn.04*i4 S 
No robust homolog present in Genebank/EMBL



.':LCTRCRn'E tCl^Ui Pf ^naUCHUJl^ETTTACItU^RrC FAPLWBI^T^ 
HRIAIWICVU)SE33KtLB?LISYMSGiT3EoCtm-RP^K>fV»J<^^^ 
IRCCFFSEDAVPESEPFDLJ : VVHTDRSf:PLPTKKR3;;3WELCr(.'£L?£S lYPOSEFUX 
APRMLo 



CPn_0465

Aw4«t I i Jwfc< l"*"^i''-••^*' 
RDECKVHCCLPSAPFF 



CPn_04«6 533713 536537 

pmp_15-Polymorphic Outer Membrane Protein

TSMRrFCFCMU.PFTFVLANBCLOLPLETyrTLSPEYOAAP0VCFTKN0NODtAIVta«N 

OFILDYKYYRSNOGALTCKNU. I SENICN\^FEKNVCPNSGGAX YAAONCTI SKNO^ 

Trrn^VSCWPTATACSLLCCALFA r NCS ITNNLGCXTTFVDNLALWCGGALYT^ IKDN 

KCP^IIK0NRAL^^SDSLSCCIYSGN5 UfIEG^lS GAI0^rSNSSCSCGGIFSTC^LTISSN 

KKLrElSENSAFANWCSNFNPGGCCLTTTrCTIIIJNRECVLFmiOSOS 

rrKENGPVYFU4NTATR(XALLNLSACSGrCSFILSAaCDr:F»itOTASK^^ 

AIHSTPNKNU3 ICARPGYRVLFYDPI EHELPS5FP: LFNTErrGHTOTTLFSGEHV^^ 

DE^WFFSYLR^f^SEUlOGVIA\ra;ACLACYKF^CRGCTLUXXKAVITTACTIiTPSCT 

P^TVGSTITL^WIArDLPSILSFOAOAPKrwrYPTKTGSTYTEDSNPrrTISC^LTLR^IS 

NNEDPYDSLDLSHSLEKVPLLYIVDVAAOK INSSOLDLSTLNSGEHYGYQCIWSTWVET 

TTirNPTSUX^ANTKHKU-YAWSPLCYRPHPERRCEFITNAUJOSAYTALAC^ 

0£EKGHAASLOGICLLVHOKDKNGFKCFRSH>rrcYSArrEATSSOSPIffSLSFAOFPSKA 

KEHESONSTSSHHYFSGMC I£>nXFKEWIRLSVStAYMFTSEHTHTKYOGLLBCNSOCSF 

HN>mACALSCVrLP0PKCESL0IYPFITALAIRGNLAAF0ESC0HAR£FSLHRPLTOS^ 

UMSIRASWKNHHRVPLVWLTEISYRSTLYRODPOJISKLLISOCTWTTQATPVTYK^US 

IKVKNTMCJVFPKVTLSLDVSADISSSTLSHYLNVASRHRF 

CPn_0467 536528 539434 

pmp_16-Polymorphic Outer Membrane Protein

^rcILTISD0^mKrKXPLVSKTPPKFL^YLG^ffTACHFa^PAVYSLaTOSLEKFALERDE 

£FPTSFPLLDSI-STLTCFSPITTFVGrniHNSSODIVl.SNYKSIDNILLLWrSAOGAVSCN 

NFlX5NVEDHAFFSKNLAIGTGGAIACCXWCrrrKNRCPLIFFSNRGU««S^^ 

AIAOCTrriSCNOCTFVFVNNSVNNWGCALSTNGHCRIOSNRAPLLFFNWAPSC^^ 

RSEIirriSian'RPrYFKNNCCNNOGAIQTSVTVAIKNNSGSVrFM^ 

GCAIYTm^IDDNPGTILFNNNYClRZXXy^tCTOFLTIKNSGHVYFTI®OT 

LODSTCUJ'AEOGNIAPONNEVFLTTPCTYNArHCTPNSNLOUUNKGYW 

HPTTNPLIFNPNANHOCTILFSSAYIPEASIT^EIOJFISSSKNTSELRNCVLSIEDIUC^ 

FYKFTOKGCIUCLCHAASIATTANSETPSTSVGSQVI INNLAINLPSILAKCKAPTUklR 

PLQSSAPFTEiyiNPTITLSGPLTLLME ENRDP YDSIDLSEPLQNIHLLSLSPVrWHIKT 

DNFHPESI2WTEKyGY0GIWSPYWVtTITTlTJNASIETA*rrLyPALyAWrPtCnCVW 

yOGOLATTPLHOSFKrMrSLLRSYNRTCDSDI ERPFLEIOG I ADCLFVHaNSIFGAPCFR 

laSTCySLOASSroLHOKISLGFAOFrTRTKEICSSNNVSAHNWSSLYVELPWrOSAr 

ATSTVLAYCYCDKHUiSUHPSKQBOAECTCYSHTIJ^AAIGCSFPWOOKSYUa^PFW 

AIRSHQTAFEEICCNPRKFVSOKPFYNLTLPLGICJGKWOSKFHVPTEWrLELSVOPVlYO 

ONPQICVTLIASCCSWDILCHNYVRNAJjCYKVHNQTALFRSLDLFUJYOGSVSSSTSM 

LOACSTLKF 

CPn_0468 539608 540432

pmp_17-Polymorphic Outer Membrane Protein

lYKLLDNKLMIFYDKLYFH IKVVIMFMRPICLS ILSTALCCSLSGNEVPNLASCQKSIWDI 

SATHTSPSFRI/An'PEPLVSSniPSNLLNCFGHDtTOOITITCNSINSVIDYNYHyEDOC 

ILACKNLFISDnCCNLSFERKSSKSSOGALYSVRECWrSKNONYSFrSNAASlATTPTSC 

FCGAIHALDSYrTNJrtiGBCCFLaJVSraraOGAIYVGVSLSITDOTAlPtVIiaO^^ 

FCSOGIFCRAVNIERNYQNIOINDNASCOGWyFLP 

CPn_0469 540399 541460

pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0469)

CFRTRGCirSALGVIISSNKEriErSNKSASSI>n'ASGKLYPCG0CIMCrr^VIE3WPKC 
LI FNNKTAALSGGAIHTRSFI FQNNGPTAF I NNSATSGGALINLSC IGSTPONFTLSADY 
CDILFWM'rTSSSPOPCYRhULYAAPGINLKLCAROGYKILFYDPrDHOOTTrDPIVFN 
YEPHHLCTVLFSCirAnDSNATNPLWFLSKFSNSSRLERGVLAIEDRAAISCKTLSOTOCI 
LRUaJAALIRTKCPCSS INFUAI AINLPS ILOSEASAPKFWI YPTLTCSTVSEDTSSTIT 
LSCPLTFLNDENENPYDSLDLSEPRKDIPPPLPPRCCXrKKNRYFESHCRSHEUl 

CPn_0470 541357 542532 

pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with
0470)

rSLmXRISPLLYLLDVTAKKI OTSNLIVEAM^^EHYGYCX; IWSPYVIMCTTTTrSSTVP 
E0TNTNHR0LYVIWTP\A3YRPNPERHGEFIANTLW0SAYNALLCIR ILPPONUCEHOLCA 
SLOGUSLLINOHNREGRKCFRNHTTCYAATTSAiCTAARHSFSLGFAOMFSKTREROSPST 
TSSHNYFACLRFD3LLFRDF r STCLSLCYSYGDHHMLCHYTEI LKGSSKAFFNNHTLVAS 
LDCTFLPAR ITRTLELOPF I SA lALRCSOASFOETCDH r RKFHPKHPLTDLSSPIGmSE 
WKTSHH r PMLWTTEISYVPTLYRKNPEHFTTLLrSNCTWTTOATPVSYNSVAAKIKNTSO 
LFSRVTLSLOYSA0VSSSTVC0YLKAE3HCTF 

CPn_0471 54^561 545401

pmp_18-Polymorphic Outer Membrane Protein

WON^mSL»*KSSFF^,X;AL I LJKTT r LLNATPLSDYFDfKjAWLTTLFPLIDTLThBfmS 
HRATLFCVRDDTNOO I VLDHONS I ESWFENFSODOGALSCKSUVITNTKNOILFLNSFAI 
KRAGAMYVNGNFOLSErmCS 1 1 FSGNLSFPNASNFAOTCTCCA\rt£SKNVTI SKNOCT 
F I NNKAKSSGCA tOAA I IM IKDNTCPCLFFNNAACGTACGALFANACR I ENNSOPIYFLN 
NOSCLOCA I RVfrOEi: I LTKNTC.T/I FN^J^^FAMEADI SANHSSCCA X YC tSCS IKDNPCI A 
AFONNTAAROCGAICTOrJ LT lOOSOPV/FTTJNOGTWOCA tMLROOnACTLFAOOCOI tFY 
NNRHFKOTFl^N) IVs-\'NCTnN\.*.-:LTVr.A.':OCI (.':ATFY0P I LORYT fON*; lOKFNPNPEHLC 
TI LFS:m* C PDT.'TSRDDF li'HFftNH lOLYNCrTLALE&RAEWKVYKFDyFtXrrLRLCSRA 

vFrTTni- F^\:: :v\ ::;v t n t nnla i Nur, r u inrvapklw i r pttc^^-apy.iedwnp [ inl 

:*/:r'f j:r.LDDENI J)n'iyrADLAc'J' I AEVPLLY1.1.0VTAKH IWrDNFYPRnUTTPOiryOYQC 

vw::f 'Yw r pt irT::rT;i';EOTV'NTij(Rf,w.vi iDwrmiYKVNPENKnD r au':afwoi;phnlf 
ATU»YO'r(.V t Ai TA.*.-' ir^vrr*! .fvi ujur.NHi jakofi fMEATGY.iri rrr: :nta:m ispttvn 
Fr;oLF:;Nt,YK::it::r^jvA::ii*rTr/AU; rNMiwi<;n RF:rr,';ASi-\Y::Y::NHii i KA;x,'Y:y;K 
ic/rRGK(:v::'prL» •«\Auy:::u:;-.vwR::ftit.nrrrF roA r AVRj;NyTAFoti'r:oKAMKF.':wr 

I Al'^iA( AKk% ;rrfJO f F (FIku vki.oyi> :;;v::;::rnTMYUiAcrrFKi- 
No robust homolog present in Genebank/EMBL as of 11/7/98
FVFMA:;G;f;^:;:;r;LC}CIPPKDNr:DRSR.':P3PKf:ELCSHEI3LPP0EHGEECASCSSHIHS 
Jj:;FLPEDOE::0:JS.'33AAS3PCFr3RVRSCVDPALKSrf:NFFSAESTCOARETR0AfVRL 
.>KTITADEflnDV0SSSAAATEARVAEDA3VSC£WPS0CVPKT3SCPEP0RLFSLPSVKX0 
;K;a:RLVGr/RDRrVLPSCAPPTDSEPL3LYELNLRLSSLROEi:3DrOSNDOLTPEEKAE 
ATVTIOOL :0 rTEF0CCYMEAT0GSVSLAEARFKGVerSDEINSLC3£LTO?EL0EL«SO 
•".DCLONtXDETADDLEAALSKTRLSFSLDDNPTP I DNNPTLI SOEEP lYEEICGAADPOR 
THENWSTRLWNO t REALVSLIOI I '^llCSlUiRLP, lARHAAAEAVGRCCTCRCEECTSS 
: - :-:*•:^- Mrvr;'*-;: *:v^t:;A:."";wA!;='::':A*^T^-rt.*;T- " 

"■ r.:.-- • ---rw; :v:"- :-.-M::rji;i! v-t»::-u;:v:-vr :;:;'pw:-pAp-r.!:zc>v 
KCOYtVP ITJAEP3KDKN : YKTPRLATPAIYOLf'GR WSSCSi-RJPGSC/RVRSSoPNRRC 
VPLPPVPSPAMSEBGSIYEDMSCASCACESDYEDMSRSPSPRCDLOEPIYAfn'PEDNPFT 
ORNrORrUJERSGCASASPVEPrYDEIPWIHCRPPATLPRPENTLTNVSLRVSPGFGPEV 
RAALLSEWSAVMVEAESIVPPTEPCKESEVLEPLOGLVATTKIUjOJCCWPRGESNA 

CPn_0473 549602 548070 

No robust homolog present in Genebank/EMBL as of 11/7/98

GS IHAVCGVCGSRSPS PI p prmRNSEDGKVSPKDhfLGEHTVSSSDSSLASOGPTIEERKA 

OtCGTOXIPLPSVKEPGDSOTSCRSGVLORIWKCVKCWFKKTPOARPEVSSPRLPSHVOH 

G0RLPCL£CnU9RIOKR5ENPEADLCKMKRSYSDCOU>RVCHO5NC£)5T£D5R5BGGEPS 

3KSSSFLSG\mCAVSKVHGALCDrKGKr0RSASEODLTT0CEDSACDrrVKERRSEEAEAS 

SKSSSFI^^CATSTVOGALGDAKEKVSAFGEOAAGAIRSAPGNIRTRFORSSSECDI-S 

>WNKAAKHLRKALEM.E}a;APEaVSPEVASRVQSUJ«MEOLTHOEPPTVEDLITFVESN 

'/GSOSVEYAS rVPODGSQAPAETAEAPETGGVECSAAOCAWKALRDFWS IFOAVASFFR 

AIASRLSSARRESAVDDU^ESNT(>nn/EOEGVSNPSAAPSLSFAEEIAIUlfcA£«SNRN 

OSLEKLESGNVTDPVIQOGLGI-ARSFAPECO 

CPn_0474 551600 549807 

CT365 hypothetical protein

LKI^ISISFMSTSPIS^roPRYl.Sl^NATEKTS^J^NSRSl^PVPNSLVPSf^PED^CLRKS 

r FTHSVTLFAGLWLLVAVSWWALTVLAPGVPOAItXG lA ISGVG IGGFS IMKSLVYM 

VRDyMSPRMQESSRIKSALAVGTCFTVWLVMKVGANFVPa;yra\roSLX;S 

r^L^SFSKYIYTKFFRSEKVAKC£KLTEACTIKEAKKIi^fITI^IATICWCIAVlfiIL^ 

lAGTVLLSCAPATIAI riAPPLISIGLTTVt^IUiSSICK&W^TIXTOEKKDLFVDTSL 

KDIRI£KlJ>PSEVEESCTS0WIEVPDSECIAETRISAEErDTRI^LTTROKVIFAIArL 

ILLAS IMF IVTCFGGLTVMOVLLVASVGSAVASVTLPMVSSGFSYVAVOUCARLNISKL 

RWKEAKNKKRVR0FLIESCVIASDR£JN3MWKTVYKK0I0KrDAAIR£EVRNFa 

SALVCGirXGVCTGIMLLALVPAFAPrVPGIU^UXISTLCIAGSII^^ 

LYERWlNRRELLYGPESKMRSIATDLVVEAIJU^HDHrj'OtlCPVDFXDV^ 

CPn_0475 553850 551685

glgB-Glucan Branching Enzyme

?SMVCKLrHPWDLDIiVSGROKDPHKLLCItASEDSSDHI\rtFRPGAHTVAIELLCELHH 

AVAYRSCLFFLSVPKG ICHGDYRWO^CLLAHDPYAFPPLWGEIDSFLFHRCTHYRIYE 

RMGAIPMEVQGISCVLFVLWAPHA0RVS>Ar3DFNFWHGLVNPLRKrSD0GIWELr\^^^ 

EGIRYW^IVTOSGNVrVKTOPYGKSFDPPPOCrrARVADSESYSWSDHRWKERRSKOSBC 

PVriYEVHUISWQMOEGRPI^SEMAHRIJ^CKamYTHVEL^ 

GYYAPTSRYCm*OEFOYr\mYLHKe^ICI ILDWVPGHFPVDAFAIJ^FtXIEPLYEYTCH 

OAUiPHVWrrrFDYSRHEVTNFlXGSALFWUJKMHrDGLRVDAVASMLYKDYC^^ 

PNIYOGKINLESIEFLKHI^JSVIHKEFSGVLTFAEESTAFPGVTKDVDOQGLGFI^^ 

GWMHOTFHYFMKDPHYRKYHOKDLTFSDffArOESFILPLSHDEWKCKGSLVNKIJCD^ 

VmtFAQMRVIXSYQICI.PGKKLLFICGErGOYGEl«PDRPLDWEUM«YHKT^ 

LMALYIHOPYlWMOESSOECFHWVDFHDIDOmAYYRFACSNRSSALLCVKHFSASTFP 

SYVLRCEGVKHCELLIJm)DESFCGSCKCNRAPVVCQlXX;VAWGIX>IELPPLAW 

FT 

CPn_0476 554877 553858

CT865 hypothetical protein

GRCRiW>GDCMIOIMQHFKPYTHVPGOKLPIPGSLLYAOVFPTLWRLFSSKHEILNEOT 

LOVQGPLKRFAVFOOLHRGGLAVTSERYKVYl^PSGECTOSIKGKLPSAAOACPLLSLGV 

HKHArWKVRCRiU3UCEILPl>ffTlFAAMAPKCSYRDtrn'AIGSLVKTAH0RVl^ 

lAPAIXSIALACFSECFLPRSYDEEFOGILPODGDPEGCVPFELLSYSFGMIODIFLBHQ 

COLVEILPALPPEFPCCRZ^IHVALPrnjCTLSIVWTKKTIROVELHAEYSGEVFLKFCSSL 

CSARLREWSERRLSCSKRLSLCETLEIKAGTTYLWDCFHK 

CPn_0477 556U2 554844

yqeV_Bs Hypothetical Protein

RYmVAEVXGTFKLVCUXrRVNOYEVOAYRDQLTILGYOEVLDSEIPADLCIINTCAVTA 

SAlESSGRHAVROU:RO^^PTAHrVVTGCU:ESDKEFFASLDRCXr^LVS^^<E^ 

^HIILl-C'^^"^''^'^^'^^^'^°^5f"CSYCIIPYLRCRSVSRPAEKIIJ^IAG\Aro 

OCYREW r AG INVCDYCOGERSLASLI EOVDRI PGI ERIR ISSI DPDDITEDLHRAITSS 

RHTCPSSHLVUJSGSNS I LKRM^mKYSRCDFLDCVEKFRASDPRYA^TTUVIVGFPCESD 

32^5?Ji£n?^^^*^^^P^^^^**'^*^'^™^IPWVIYERKXYtA^ 

EKMKRlX;ETTEVLVEK\mK3VATGHSPYFEKVSFPV\ArrVAINTLVSVRLDRVEEECLIG 
EIV 

CPn_0478 557640 556210

hflX-GTP Binding Protein

WOTLOrrDTPGEOGSOSFGNSLGARFDLPRKEODPSOALAVASYQNKTDSOV-yEEHLO 
ELISLW3SCCZSVLETRSWILKTPSASTYINVCKLEEIEEILKEFPSIGTt.IIDEEITPS 
S2JIiS2^°^^°^'^^^^^^SSRALTAEANIOVOLAOARYLLPRLWU^ 
SSS™2?^^^^'^°^^°^*^^^'^^"*^t^^''^VIK0RAERRKVKSRRCrPTFALI 
GYTMSGK3TLI^U*TAA0TYVEDKLFATLDPKTRKCVLP0CRKVLLTDTVCF I RKLPHTL 
VAAFKSTLEAAFHEWLUlVVDASHPtALEHVQTTYOLrOELKIEKPR I ITVl/^ 
(JCSrPMKLRLLSPLPVLrSAKTCECIONLLSUfrEIIOEKSLHVTLNFPYTEYCKFTELC 
OA3WASSRYOEDFL\A/EAYLPKEL0KtCFRPFISWFPEDCCDDBGRGPVLESSFCD 

CPn_0479 559434 557616

phnP-Metal Dependent Hydrolase

A K^MVRD tO-IES ICKLVFLCTCNPEC t PVPFCSCRVf.-QNTX: t HRLRSSVLIOYOf «TLVT 
l)A(:PDFRTQMLV,WGELaA^FLTKPHYOHICCIO&LRAlA*r\nX?RSLPLVU':A.rrYRFL 
NKAKE'^LFATPNVES.'^ LPAVLEFT I LNEDCOQEEFy; { PXT^-V.TVYCKiCffVTnFRFCNL 

I . r tn I r: :nf :i .EAEft DOf I PCVTFAYDi WEVUTTL 
' .*r '.H I I lyrx )r h( ff i c.i I prop.e i n 

: ihV'M t :;wr: I ; i :i(U;r:Kv.':KKGocx;MU7:;i .rf;Yn :A(;r i [EEYKNnvFYCgLCAEW.TPYW 

IV I WD'/(> lArrTf iriAjVl.nCKOHKhV It.l'VWlPtT.'JtWALEPVCKiIAlXJLEnAMYEU:.': 

'JVUN^••IM^•:: rv:;wvFT.;(:r a: I FACL iVf7VMVEAPLiAr;LJAWv I rt: t ra7A^A £ LCLFA tL 
HAYii :i« ;i'yfaviiii-;i!Ey fTormTRO C0Aii:K?N\*r:vrTEYPA'n*Ati:0f'iTKLPNCsnR 



No robust homolog present in Genebank/EMBL as of 11/7/98
3CUI tECILMAT3Vp\TS3TT.*CEANSSNER FTERTJRMYYAALVLGAUCl rFIAMIV- 
LYEYPtSYLWA^ZLXVRGTEr5LED0AO\T.r:-0G:XSMLSOFASRU}SCOKVXOT
O\n^E0AAVMLVHCI-W0C^^3F<X;UCALKYLTAVP0RMWLGALPLF£SFP^^^W^ 



ESLCD 

CPn_0482 561764 560961 

artJ-Arginine Periplasmic Binding Protein

NLAYlUC7rrMrK0rGRFFRAFIFrHPt^LTSCESK:DRNRIW:\xrTNATYPPrEYVnWG 

PYYCOOTEXKVVSKRSIXrP'/l.PLTQYSSVAVQTGTFOEHYt^PGICVRSF^^ 
tMEVRYCKSPVAVLEPSVCR^VUCDFPNLVATRLELPPECW^'LCCGLCVAKDRPEEICrr 
QOAITDLKSEGViaSLTKJCWLSEVAYE A^wnrM.iwi * 

CPn_0483 561330 564964 

No robust homolog present in Genebank/EMBL as of 11/7/98
I ILIKKRAI FERMFPI PPPHCPPhWKNNFYHLrrorKDPLLLRILRTIGYVU^ 
tXLIHYYKHHRVVWCEGLPTPPTl.PKGPEPKT I EIAKOPPKDGEDKXPDVPKPCTPPPED 
TPPPPPICM»SPASPKWKOPADKKPTPPPEAPPPPVTlVATPMPLRPSSOCVl«CUnWVS 
MVLRRAPLPLPAMOVDPILCDFNPHFVASYPtmXDNEPKVFOIKOFXKIAONPDU>0^ 
Rl^t^LBOALyLNEWYYL\^PGIX:»CnniAYAVGWI^ALYEESSRroW 
DLP^ASSSPA^^A^^/:AQ^AI:LLOl£STYCSF: DLYOCVII^OKHTATLIAFUU^ 
RQOIAASSNEETARALFISDMCDDLLPSVLEFIAANRPYSELFONLIDHSALPYMOSRDK 

VmaJ»EAIRCOYSRnATrEimRSGDLPWSPALSFFAFLCTCPSVRrHKLCATrYlC^ 

DIIIASAPPQRSIQEItCISNASLSYUJEDIXJSSWQREVISSNIKTILrrHESLTLCSSM 

PQLETIia»IANLLKNVrSTSFETPPLSNOPDlXSNLVrO(LLVAIHSKIXtJ^^ 

ARSIJlLTRDE^SGI^OEODtiYTOAVQLLFFrL0HPOV^»mPCTKDAVKEI^^ 

YAFKKVrtliEKKLOKIIJlSrLCSLVUCPPARYPSTPSNKDKCTrCKFW 

EKNO!QFLRATFPNyOLETEAIU.EKEIESTFRNGWNVF^TRUn*FGSKLGSP 

DQrSKSFLIFCFtW*YPKLL0KKTPtJU«UlArOREASHRrrQVKDKL^ 

ATINQYSRAW)OLICNUJamrrASDGFCRSGFROSLICYUiSCSSNn/a)ILD^^ 

EANWAAKTTVPUJPFAVCLIKSDRinVSEailENrVAMHGFLm'ISPERCARrFLIRrP 

NHYOCLLPRNPRTEOONSKPOSSNP 

CPn_0484 564931 565824

aroG-Deoxyheptonate Aldolase

RSEUCTOUCSLVLHEVLILrrrYPLPRTUCOHPDEVHTVP I SPMLSFGBGSPILIACP^ 
. TlXSVEHrVSSALTVKEACAOVFRCSIRKP!rrSPFSF(X>raCECVU«KEAOSIHC^^ 
TEVlXVRDVEITAEHVDILRrGAXNMHNTPLLOEVSKSHRPI lUOSPAATLEEWLCAAE 
Yir^SPSCPGVri/:ERGIRTFEHSTRYTLDLNTVAU*KEISSLPVIVDPSKAACKRSLV 
LPIASACLSVGAIX;U4IEVHAHPEKALCDAK00ITPEEU(LFAKK^ 

CPn_0485 565993 566229

CT382.X hypothetical protein

OPICRTPTRVrtraFWIKOACKFYU^OCUrALYWLtJCYCRKLLKCTLHHSEETLYW 
SSLIDLLYQLKQLPAPTNE 

CPn_0486 557799 566405

hypothetical proline permease
AOHRSIXKGNIFHUXrGVLYFMNFSLFLFFLIAIOCICLYVGRRCSKKVEDRESYTI^ 
SUCIFPU^^f^FIATOICGCVLLCyUEEAFCYGYGCILYPU^/ALGLIFLG^CPGKRlJ^ 
SLTTWSIFWTYGSfCKUUCIAFU.SACStJ'FILVAOVIAl^ 
VLASYTSTCGFRGVVRTPVrOACFU«IAVLVCGVSVWI-SVPKSt.sVLnpPr^<;r.P^ 
WIFHPMLFMLVEOOMVORCVAASSPKRI^AAVGACLVLLLFI^IPIJXCSLGAK^^ 
GCPLIOTIAYFCNPSl^VMAAAICVAIt^ADSLKNAVSOLIAEEYPTLKAPYYRYLVL 
GIAVAAPLVAIGFTNIVDVLILS-rSLSVCCLSVPVCrYLLAPKGRRVSGAAAWACVLVCA 
UrrCWVOIVSt^FGELIJVWVCSLVAFSFVCFIEITXWNKVKTOT 

CPn_0487 569833 568112

CT384 hypothetical protein

RRTOCISLTYSSFRWASFRCYSLIFFCFCGSLFCSESLRYOLLIODFAKVSEBCICUXS 
KEYSLUJAKLVLRALAONSSFDIWFRSFKKCO I SYPELAHDRDVLEEFCIOVUUCIENP 
SVTVRAVSVlAICLARDFRLVPLLLOSCNDDSArVRSLALOVAVNYCSESLKKArVELAR 
NDDS IHVRrTAYOWALLO r ESLLPFLRERAaJKLVOSVERREAWKACLELSSOFLCrCV 
AKDO ID0ALFTCEVUUO4L? ETTE r FTELLSVEHPEVOES LU^ALAWSHOLONHKEFL 

skvhhvmctspfakv-rfoaaallhlhgdplcrdslveclrspoplvceaasaalcslcih 
gvplakehleslssrkaaanl3 1 lllvsreoi eracdv i arylsnpemcwa i eyflwdaq 
wnlrgdtfplysdm i kr eiorkl i rllavary30akavtatflsg00a0cwsffscmfwe 
ecdvxtsedlvtdacfaakl£gala3lcokkcxiasu3rvs0lynosrwodklailesvaf 
senloavpflldcchheap<:lrsaaacalfsifk 

CPn_0489 570147 569767 

hicA-HIT Family Hydrolase 

RKLPTCFAVNVTRSRDHHnTKO 1 1 DGL I DCEr/FENENF I A I KDRFTOAPVHLLI r PKK 

P I PRF0DIPCDEMIU4AEAOK : VOELAAEFC r A£3CYRW r NNCAECOOAVFHLH IHLLCC 
RPLCAIA 

cpn_04y# '•."ion rnuvt*, 

CT1U7 hypothetical protein

R f VFAI ENYF.TLVfMEMrft.RR I'mjlPm; I'mrOCCFI (ADEVTA{:Ar.L I IFOLVOE>nC I 

tR!:RDPW[„':Kt;R\\\"Ovi\^-Y- rEMKRFOnnov.'iYDf :.'M:::;Ai :m cuiylkfr^mdcee 
YHKiMrrLviKn/Deo«w:RFK.:KE'';Fr:.T::Di tKrYNr-ffEEKErrrt;uADF:a'AUiFTrDF 

LL*MLHKKRjYDf<Vt *U\ W Vn F^\METFXMt:r /^FD»>I't>V</ENFFni 7;RKI ! rAAf-VrFP*;CD 

<jw I utr: J pfNi.Df<RMi vMvryrFiwvji.fi ;KE;«':Jcvr/: r roAVFf :i(Kf :t,F(.':w/™KF'ir 

i.'PfiJM'jn *. ijv*. './iti: 

i.-riH7 t(yi<trt» i I. -.a |M.>r.*rifi 

iMYNLUiAiii(iJAA::rn :iMA::Mf ,rKi,::pii I YK':£.VL ii-riirAYK r/:niM V." ' iuvnlk:; 
::i -Mju;viwvu« it.Ki jikauk r.M- 1 j tvi >-m: y^i a- r atamlelu-:!* ;;:f-vi ;Kr .ka.\j umnu 
rLPLM';K;;r.TPrHI.KIRKrLPLYOMVTnPPPVr'E, LLrKTEPLHrRr/TARWODU-
POf^LRIlTAAO r LEPrrOESCD t YEF^CTrriEP r ER : PLEFFTLEPYKEHSFFFYROKWE 
-'E::POE:VFtTVFE3tPEX:EDOAAMFt.';KC3ELL£L50DSWIlKPR23P3DERHABEI0KH 
[=-MPCFPFLlO\METDHIT30f:Vt.F3RYFPSA.';LKCMFLSNYSIiyYL0HIYF0IPSPTSC 
«F?NRD!UFLl^LYFACI3VFVADLE3KRUX3YrKRRNKDVCMFVPW 
TtHCSCLIACDYDEFLRELLTCKKTI^OOFTrPEFPPCTrPLAILTaXSCAME^ 
I • ';lC7TU\YVEAKKSYAI POLLCTOADFHVDlJ^VFVICX^MCTOra^ 
^TLKTCKKAU'rVFLrCPVDYWKSKITALYNGrmAVGTIRG 

••.;:irr: ; •.*r '-i';vrr:r':r/-v 
':Pr..04Vl 574')VS '/OiJ-^ 

DHLWNYENOraTCYVOSUXMHFUJSim^rviEK^^ 

PFICAVEICERPVCECmSSAERPtXPKETrUWPIFC^^ 

ROVTT«ACIRFNEKVVCrmvCATIFCCDFrLLRUTJVSRFHy^ 

PESC«VNSDFrVAGLWSGAIDKV«ninU>Wl^SHLOTEriLTHPNrPRna^DEXOTLf 

:3FRYTP0IRLYGGCCYrvSRCLTFPERPFYCEWCyV£UlPFCUlBC»^^ 

WEEOKTCUSQSYIUWEWAJCrOEIGRKIRAVLEVHOCFSmXJFIREPCNYYGrRLTO 

CPn_0492 574643 574804

No robust homolog present in Genebank/EMBL as of 11/7/98
LFSLIFPICEERNSQOTYKHLHVESACFLLESPLKIHWSSPYCFPPFYRRDLKL 

CPn_0493 575U2 574855

No robust homolog present in Genebank/EMBL as of 11/7/98
SKTECSHSKTSKCFVCRFVOWIRTProRCSKKRSPSSFSPTHPYIRUriYrRSPKOSGVE 

RKQEDAETSFlCTPKCIUaCPOOCDPKGKHVHWXDS 
CPn_0494 575370 575146 

No robust homolog present in Genebank/EMBL as of 11/7/98

VIMIR\rtWCSVTU:aWPSPEKKKDVPLSC3iSRLHRR0GIRRK)^ 

SLEKKVKCISEAHFK 

CPn_0495 575507 576793

aspC-Aspartate Aminotransferase 

RittjamOKMAIQKACAFIJlClJ'SESRPVLEHAMRJWPHrSlXKPOY^ 

ENPEI IGOTTOPLCRSirOAIKEFCVSOEKQErVRGrGPrrGLEKIJmcI^ 

YENRISPEEIFISDCAKPOIFRLFSFFGSOCrLCLODPVYPAYKDIAHITGIRDIIPIAC 

3JCETGFIPELPITOSLDItrLCYPNNPTGTVLTFQQL0AL\^AN0HGTVLIFIlAAY^ 

VSDPSLPKSIFSIPEAJCYCA:E:NSFSKSLCFTGMRI-AWN\aPKELTYriNNEPMimWKR 

LfATTri«3ASLIJ40EAGYYGLDt^PTPPAISLVLTNAOKUaCSl^ 

WVELPEGISDEEAFDFFLHOyHIAVTPGHCFGSCGQGrVRrSALTQPONIALACDRLCTA 

3UCETKVLA 

CPn_0496 576751 577812

CT391 hypothetical protein 

PPLYRrrKRNDGSOrrrUUCLSOYLFFrSLFCSFIYVATCCSOPOSVSSPKIAIFLSFPH 

PLl£DCSKSClETLKDFanJ»EIWUIA£DSlVKARKIARSUm)KNWArVTI^ 

VMSHIETQKmYAAVPDRESLTLPKmWIVGViryrLDINOVCFMOAVATNAOSIVVL 

KPSEPFPSDLOKEIVKKLHASCIEVI tIS ITSSTFKTR IROAIDKRPSAIFI PLSPLSHK 

ECTAFWEXIJCEKIPIITDDTSt^SECACIACSVDYKKSGKOtAKIVfm^ 

RKlIAQRI^PTTTFNEDIIKYLGIKLHKTERWFLSnCSKKLXKSEKGK^ 

CPn_0497 578107 577820

CT388 hypothetical protein

IFQRWLDOSWrLEVKVTPKAmnCIVC?[XWAlJOm\n'EPPEKGKANDA 
LPKRDVTLIAGETSRKKKFLLPNRVQDIIFSLHIDV 

CPn_0498 579062 579085

No robust homolog present in Genebank/EMBL as of 11/7/98
YCRLRRAPFMNRRKARWWALFAKTALISVOCCPWSOAKSRCS IDICYI PWNRLLEVCGL 
PEAQIVEDLIESSSAWVLTPEERFSGELVS ICOVKDEHAFYNDLSLLHHTOAVPSYSATY 
OCAWFCGPLPALRORLDFLVREWQRGVRFKK rVFLCCERGRYOSI EEOEHFFOSRYWPF 
PTEENWESGNRVTPSSEEErAKFVWMQMLLPRAWRDSTSG\mVTrLLAKPEENRWAN^ 
CTIXLFRSYOEAFPGRVLFVSSOPFIGt^ACRVGOFFXGESYDLAGPGFAOGVLICfHWAP 
RICLHTLAEWLKETNOCUi I SECCFC 

CPn_0499 580404 579205

No robust homolog present in Genebank/EMBL as of 11/7/98
uvyu.ifyfc:ncstmssvnossgtpnpeevtspesteenknwssoeaoathavalpiv 
tolslpegvgtsseetasnprvdeivaevsssravadoisslvervceixddlkcaoslf 
tsf0seij(nclpawcsstrrlsritlgagt»jaoiarlelfrsdyeavlchan0fhgk^ 
sicltovhhkl(x3i^redi^lj^dnndrvlehlgslcldvtjaecnwsl3cer^ 
dsmlvoikkvm-ptveeijttuotresssdprveesl^ccerllnelrrlwanfvgfiss 
cydnivfvlmwi vrr i nllpglcclpfhnpdasoedorsssgerstrrerlsrrsolsee 
em tvraeces i kpespkgxrnopsrcdkcdsoseeetel 

CPn_0500 580647 582362

proS-Prolyl tRNA Synthetase

OPHSMKTSOLrrKTSKNANKSAAVa^NEUXKACYLFKVSKCVYTYTPLL^ 
:R EEIJ4AICC0EUXPLU<NAEWHTCRWEAFT3ECLLYTLKDRECKSHCLAPTHEEV I 
C-JFVAOWLSSKP.OLPLHLYOIATKFRDEIRPRFGLIRSRELLHEDSYTFSDSPEOMNEQY 
EKLRSAY.sk I FORLCLAYV IVTADGGKIGKCKSEEFOVLCSLGEDTICVSCSYCAN I EAA 
V;^ I PTOMAYDREFLPVEEVATPGITT lEALANFFS I PLHKILKTLWKLSYSNEEKFI A r 
JMPJ30ROVNLVKVA3KLNADD [ALA5DEEI ERVLGTEKCF tCPLNCP IDFFADETTSPMT 
?jryCACNAKDKH YVNV^WORDLLPP0YCDFLLA£ECDTCPE^^PGHPYR I YOG lEVAHI FN 
I* rrR*n'a':Fr/flFODEHCQTO?CWMCTYCtCVCRTLAACVEOLADDRG IVWPKALAPFS I 
•r C AFrnr :OTVr-OEUETl YHEWiTOCYEPLLDDRDERLCFKLKDCDLIC I PYKL£LCKSY 
O::;?"; I FEI E::Ra;EKYTV.';PEAFPTME:ONMLA 

hrcA-HTH Transcriptional Repressor

t LLTFfTf :::r-p t ML:•^T I VLVr;Ll-MAR::KVSKRO:iK ir t LFATTELVLKTCOP'/'TSKTLK 

(•:::r;:;ia-- tat t HNNTAi;L.FJVii:FLKKNitT;ra :r i r-TDLAU^iiYvniioKEcr fjvei:;ap i 
Kt'K i:;ot.i*;:E;:f (N 1 1 Ki>i /.ikatellgei LDLPTFF:;:;rp.FFND: WTN 10 rrovoKORAvr 

I UrrEff?^! m/rtWLl*lj\i:Crru: IKRIEKFI^NY rPKLPTNEEUiKKEEHL^MTXYNEV 
WhYI.TrtVi'Nt-rflEOt.YyO W::Kl.tXYEAFKOPEV'LAt/:L::LFENRRC«c",F.LUUCMHKC 

Kvi'AF I* iKKLL.uiiirri'Att- x:jv tT I PYYwmiipuiAt/; i ujr inm'VKFaum.lklfan 
K I r I .t v: ; fy k k k 1 : : n* net ,t; :nl'k u :nf p i lrt err: :: i k llp: ; k fti'i . 



OKER0El>CYALElTrJI0FLNPrE3H£faXi:FATCN^ 

KGaEYSSICOKFNPFLHEAVOTEETSEVPECrrLEEFAKGYKICERPtRVAKVKVAICAP 
TPKENKE 



CPn_050 1 



Sfl4225 



??ii2l3 



••: r-'.-vMKr.' •:: v--. :•^;r-:^■r7^.*:v^^'"■r:^:KLv^:: 

PAKROAVTNFEKTU;:>'TKf'.F lOKK t J tv AJEI ^TV*- i-r/T^sjijKkjOAVrENfTXJKOYTPEE 

tCAOILMKMKrrAEAYLCETVTEAVITV-PAYFNDSORASTKOACRIACLOVKRIIPEPrA 

AAI>YCIOKVCDKKIA\^0L0CGTFO IS I L£ICIX;VFEVLSTNGI7rUjOCDOrOEVlIK^ 

Kt ECnCKOECrOt^KDNMALORUCDAAEKAKI ELSGVSSTEINOPF tTMnWG 

LTRAOFEKLAASLIERTKSPCIKALSOAKLSAXDrDDVLLVCGMSRMPAVUrrVKELTC 

EPNKGVNPOEVVAIGAAIOOCVLOCEVKDVUXCVI Pl^LCI ETLCCVm^ 

TOKKOr F5TAADN0PAVTI WLOGERPMAKDNKE ICRFDLTCI PPAPRCHPOI EVSFDIO 

ANCIFHVSAKDVASGKEOKIRIEASSCLOEDEIORMVRDAEINKEEDKXRREASDAKNEA 

DSMIFIUElCAIKDYK£OIP^^VKEIEERIENVR^UlJa)DAPIDCIKEVTO 

GESH0S05ASAAASSAANAKGCPNINTEDIJCXHSFSTKPPSNNGSSEDHIEEASVEZIQN 

DDK 

CPn_0504 586418 588514

vacB-ribonuclease family

ATOFTSET^GFLVOCPKLTCGAOUJCKPKRXPCRRTYCKSLKIFIP(7I^FVHARI0CFC^^^ 

SPDNPEEYPn)irV^ARDUtGAUCTMVIVSVLPYPRIXMKLKCrrSEVIJUl^^ 

ITSLVSPTSALAYTSMSGSOSLI PVELLPCRTYKICORI LLSTPPWVDKPOBCASPALOM 

LEFICMITHAKADFOAIOAEYNIJ^EEPTPEVIEEASLrSOKHITOVUiSflXDLRD^ 

IDSSTARDFDDAISLTYDHWmiUIVHIACVSHYVTPHSHLDKEAAKRCNSTYrPGKVI 

PMLPSALSOJICStiCPNWIUAVSVFKrFTKSCKLSDYOIFRSVIRSXYRK^ 

EKKHSHPLSKIL^rtM^TI^KK^SDIRE£RGCIR^VLPSV^MSLCNLOEPVALIEJ«OT 

Hia.IEEFMIJCANEWAYHISHQCr/SLPFRSHEPPN0EWIiXF0ELA»M3rO^^ 

PDYQVLrOTSACHPLEQVUCSQFVRSMKTASYSTENKCHYGLKIJJYY^ 

HAYIITAf«EX2^FVVTEFCHECFIAAAEIJKEYSLK^^ 
SVNLLTQKIVWS I ATTTEDKPKKIKKrPSKKKGTiaCllAS 

CPn_0505 588471 589X06

3-methyladenine DNA glycosylase

RKRUJUOCERiCKEPRNVWEHrFl^EDVITtAQOLLGHKLITTH^^ 

GPDDIOVCHAYNYRKTORNRAKYLKOGSAYLYRCYaWHUWV^ 

DQGXEUIlORROWROKPPHlXTYCPGKVCOAUSISLENNRORUnPALYISKEXZSGT^ 

ATARIGIDYAOEYROVPWRTLLSPEDSCKVLS 

CPn_0506 599055 589940

CT421 hypothetical protein 

CPMElSPIPRRFCKSFILNWJCLYSKETNAHFLISCRRIMKKYrrrGLVILLPLAITIAl 
VTODWTLTOPFVStASEFFEKFSFYTimRALUCrvUjr ILLFGtJ^rATl^^ 
FKSU^ lYDKIUmiPI IinVYKAAOQV>nTirGSKSGSrK0VVHWFPKANTCtGLVA 
GDAPTVCCTGEKEDOPLWFIPTrPNPTSGFLTLFRKSDIVniJMKIEnArW 
LSTPHACPSSPLPOELKQDQGS 

CPn_0507 589898 590122

CT421.1 hypothetical protein 

STPYPOFPLSGEIKKFNIELFKlTlMSKOARRRAKSPKKRKPKYAIVHPAPAPRIVYia^ 
NALSTSOSiriPKIC 

CPn_0508 590133 590300

CT421.2 hypothetical protein

SRrMSRHRSYCKSWG^mCRN^aJalFERVEVLRIaJCRW^roSTAK^^ 

CPn_0509 590299 590808

(predicted Metalloenzyme)

NKrVTLYGNFIRVTOEXIKrHVSNECnCIPIHLVSVEKLVLTU^UCVri^ 

DKAIJ^ELHDK\^ADPSLTrrriTLPIDAPCDPAYPHVU;EAriSPQAALRrLEm'SPNOED 

IYEEISRYLVHSIUlMLCYDI7rSSEEKRKMRVKEW0ILC>ttJlKKHAU*TA 

CPn_0510 590804 591973

tlyC-CBS Domains (Hemolysin homolog)

OUJMUilLLAIFCILU'lJa^TOPSCHCSSKFTJCTLNORFFKDKGREYPPFPSAPTILA 

TLLCILYCAUH'KLYTIiPPKTAHKDLLFWPLYSLSALIAYGFLPPWISTKVPKErTAHL 

RFLASVFOLGLFPLOLLFYRRRPNOOVRSSTSrOSOLSEALSAFDMLIVREVMIPKVDIF 

ALPEETTLOEALVLVSEEGYSRVPVYKKNLDN ITGI LLVKDLLLLYTSSHDLSOPISSVA 

KPPFYAPEIKKASSULOEFROKKRHLAriVNEYGFTEGtATMEOIIEEIIGEIAOEHDVQ 

EOTPYKKIGSSWtVDCRfWISDAEEYniLKIDHENSYDTLCGHVFHKVGAVPQiaaWIH 

ENFDIEIITCTERNVCKUCITPRKRKRIIS 

CPn_0511 5?2141 592488

rsbV-Sigma Regulatory Factor

HSDIOKEEHGSTTirHLHCKLDCISSFEVOENrSOSLAACSKNIILOCAHLDYMSSACIR 
VUjOJrfHQVCOHSGKIVLTTVPKTIEQTLYVTCFLSYFKIFOTVOEAIQTLNKIXm 

CPn_0512 5-^2538 594412

CT425 hypothetical protein 

SLPLTMRRSVCYVNPS tARACO rSTWKFLYSLATPLPACTKCKFDLACSGKPTWEAPAT 
DLSOTtUIVIYAEMPBGEI I EATAIPVKDHPVP0FEFTLPYEU3VCETLTIVMCASPKHP0 
VDDACaiGAOLFAORRKPFYLY I DPTCE^OEPCVFSMD tFGNVLKKI EX FTPSYVVKNK 
RFC tTVRFEDEFCNLTNF::rEETR t EL.-YEHLRC«JA/OLFI PETGFV I LPNLYFNEPCr 
YR lOUCNLSTOE r F I.':AprKCFAOSAFf ILMWCLUfCESERVDSEENIETCMRYFRDORAL 
NFYA:;r:SFEIOENU:rorWKLINOTVnDFNEEDPFrTLCCFQYrX:EPHLBCVRHILHTKE 
TKSIL';KHKE-/KIC l rU\KLYK:rrVNHDM r :; I rr^FTASKEKCFOFENFYPEFERWEIYNAW 
' :3::nTAAUWPFr tC<:KT:5EDI*Rcrn/ 1 nCLKWrLRPGr/AGGLDDRCIYKOYFCSPOfVO 
Y::it;r;rAt ia«YTRFj:i.VKAI.KAmn:YATn ir-P EVLUFttrTCAPMOSELSTCSKPCLW 

NRM I : y ;MVArrrALt.K-[VK I r r^^K ;kvi.i cTFFrDr:: iriLDVEYODMvrLi:.':vTUCDPtnKAPF 

VFYYI.H^/TC^AI**AHAW::. : r I WVW U 



VI • h •( j<' wkno i aakkki ivw/ rrvi jyi « .er t vi (0^"vi roMTTc:i*t'OPPKT.':pLY2 

I K W I dJVjEU U ICEfJAI J U.I .1 J.'ITJKKr/yf^r Mt IKAOOVPKOP VCaPVYYI^TT t.YLY PTNF 
TK [ Hi^f or 'J t H [ KALTA [ Er^-AYUJOLDNU; t R O: -KOACLDS I TOCAE I LVDK t RN
F WPKR U::' JDF[^IHm\IIOU: IHSM ITMUrYHKECPEDLVTHMV^yHDUJDCTOCnCN 

F r LLKrACEN>WU;KRLRK.730CHA [ PU<3LHAVAR C FLDNFSNMKALWNYLGIE^^A^ 
UCGANDU';.Tri IMCEKVFQMACSKEP r KMDAEXWAALITOCXIRTPCLTNSSHV 

CPn_05i: 50S690 WS20 

CT427 hypothetical protein

"NnnPHHTTRENAMrrr (CLOPC :r;Lnc/Tf tnsfpc^lcl r krno i rc-oj^ppadllnlli 

-M.w.- ::. : :..;'.:::.VA.\rTrr::. - •■-■a*:-"!^- 
: v:.:riii.wi ' --Tr.':.. , r'-c:--- v.LL:jnAA:.-.;i: v--r':r.-rvc:. 

ASC>rfDLTKLPFVFALL:j{JT3WKEHPLPNLAM£EAL00FE3SPEE:yLKEAH0KTCLPPS 
tX0EY"^ALC0YRtJC£EH*/E3FEKFR£rOTLY00ARL 

CPn_0515 596450 597181

ubiE-Ubiquinone Methyltransferase

EKfrrrKALIOJSCNIMEPSTNKPDCKKI FT3S rASKYORTTrriLSLCHKHTW^ 

GYSLLDtrACTCKVAKRriAAHPOASVTLVDFSSAMLDIAKOHLPOCSCSFIHSOINOLP 

LEM4SY?LAAHAYCUNI^DPHKALOEI3RVUlPSGKLCILELTPPKlCrHPTYSAHKLYL 

RAWPWICKSVSKDPOAYSYl^KSIQQLPKDHDLEDLPSKSGFYIAKKiCKLrLGAATIWL 

LEXQ 

CPn_0516 598904 597255 

No robust homolog present in Genebank/EMBL as of 11/7/98

R I S ISFRVSWFVKI ILAVWRAIAKAYYVWARGLCDFPTLVPNERLPICPFFVPOHTS 

GAKCKEFAKRNFS I ISGLDDILKLCILORRPFALOWDKLSVKSDYEEACPAIGIRSLEPO 

VSOISPAHGRUrSTLVQWAPILGSEEOLVWlXrrKKRUCFPKSLGSKDAVIVDSEMVPVN 

ANPTOEI PAASrrVESSPVAPCimOTIPAASCTTDTTSGVSEAAAAEAAVDSTPGTEEE 

?SFSUlYALV\n3N\^PEPPKEPEVMrn3EEKSLII^TRARRK£U)LYNCTI^^ 

DErOKHWDLPENWRTrft^WSERLYKFFTKTXKn^LEEimnCELGNHILA^ 

ARIKVrNSLVAWLLOSFVVGRSCTAKPLPTSKLDLFKSEFESKPJOJNILTEFLVASOEEI 

LFKCIAVLEPCIEGWYDHPOOAGEIRSVLBGLVOAGRISGYWEWPFCRFVLRGVCERRT 

ELVELLESLVASGEIMQrFESSOEEGAFIIDNEPSKTAMLKORFKSCVRTKLVCSFADES 

LPRCRTTILV 

CPn_0517 599637 593795 

No robust homolog present in Genebank/EMBL as of 11/7/98
F IMSSIJLSCCRIEPTRVTCSUCTYLEtTrSONQl-STRLVRASVIFLCALLI ILVCVALSSL 
I PSrMA^J^TS^^V^«LIUVMSUJGDVAIISYLTYSTVTSYRO^^alAFEIHKPARS^^^ 
GVRHWDU;RSSLCrrcEI?IVRTtJSPFWHGUWALAAKIFXF«EHFSPEPPNEPLV^ 
CLIRDFRPHVSSLCFVIEKOCSSLRTKBCNTICEArRSDYtlAHFAMVDCYRLIHSKLI I E 
KMGLKN IDI I PSVMVREDYPSRPGEGYRECLLRMYGGKGAI. 

CPn_0518 600806 599832

CT429 hypothetical protein 

F»fIYPVPCNPLLLRILRUffiAFSKSDDEIU)FYlJ3RVBCFILYIDU5KTO 

EENAERYO-IPKLTrifEVKKIMETriNEKlYDIDrrKEKFIXIWSK^^ 

AEI^W^QOFYVERSRIRl I EWLRJWKFHFVTEEDUJFTKNVlXaLKIHL^^ 

ARQU-SNKAKmSNEAUVPRPIOKSRPPKOSAKVrtrrm'ISSDmmOAARRFLFLPE 

ITSPSSITFSEKFDTEEEFlAmJlGSTRraJOLraTNI^ERFASUC^^ 

DFFGDDOEKWTKTKCSlCRCRiCKSS 

CPn_0519 601707 600904

dapF-Diaminopimelate Epimerase 

OPTKUlILVVWKAFYSPSTISKYFIYSGAGNRFlXCmPEVErWRFlXOETRVDGrL 

KPSSCAnA0LIIFNSIX3SRPTMa;rGLRCAIAHLASOKGKSDrSVSTDSGLYSCYnrSWD 

RNrtAmmAimfcSVHRlXSRPDPIJ'KEVVCIHTGN^HAVVlLPEimDLSI 

HQT^SPDCV^^V^^^:LGHCOUlVRTYERCVEGETAACGTGAIJ^ALW 

IHTWGGEtJfrVSONRGRVYLOGSVTRDL 

CPn_0520 602233 601646

clpP-CLP Protease 

ERHmiArCEWKLRDIIEKEliEARRVrFSEPVTEKSASDAIKKLWYLEUCDPGKPI^ 
VINSPCGSVIWGFAVWIX:iKMLTSPVTTVVTGrjU^MCSVI^trAAP^ 
HOPS IGGPITt^ATDLDIHAREIUO'KARI IDVYVEJVTKOPRDI lEKAIDRDMWOTANEA 
KOFCLLOGILFSFTJDL 

CPn_0521 603803 602241 

glyA-Serine Hydroxymethyltransferase

KSLLKVFEKFKKFAIVEirTKWAWSUJiKFLENASGKKCOSLASTAYLAAUWLUV^ 
PS IGERII DELKSORSHLKMIASENYSSLSVOLAMGNLLTOKYCECSPFKRFYSCCENVD 
AI EWBCVETAKElJ'AAIXACV0PHSGADANlXA\mAILTHKV0GPAVSKU7fKTVNELTE 
EEYTUJCAEHSSCVCLGPSLNSGCHLTHGNVRLNWSKLMRCrPYDVNPDrrECFT)^ 
RLAKEYKPKVLIACYSSYSRRU^FAVIJCOIAEDCCSVLWVDMAHFAGUVACGVFVDEENP 
. rPYADIVrrTTHKTLRCPRGCLVLATREYESTLNKACPLMMGCPLPHVIAAKTVALKEAL 
S*/DFKKYAHOWNNARRLAERFLSHCLRLLTCGTDNHMMVr DLCSLGI SGK r AEDI LSSV 
G r A'/^^WSLPSDAICKWDTSC I RUrrPALTTLCMCI DEMEEVADI I VKVLRN IRLSCHVE 
G33KKNKGELPEAIA0EARDRVRNLLLRFPLYPEIDLEALV 

CPn_0522 603825 604655 

CT433 hypothetical protein

REPLSPEKTSLAFKVKNVNORM IKKNOGKKKNYFOY I PLKVOKLROPS FYPKRLMTLYLG 
LrlOKTARK-rOAHrLP I LTLFPYAK3TP0NKRAL0FLFOATHVI LTS PSSTHLFLSRMTSL 
LSKATUCTKTYUriCESTKERLl^FLGOVKYVVATOEIABCIFPLLOALPSSARrLYPHS 
;:LARPVtREFLYNRFTrFSYPHYTVKPRKUCKNILSK'rKKnrTSPSTVRAFAKIFPRFP 
EKrAtCOCRMTLQEFCKrsSOKQVSLLETLCKSRTSP 

CPn_0523 604720 605052

No robust homolog present in Genebank/EMBL as of 11/7/98

I 'M/^L'ATrGFD3TAP;; LF P PATRPRYNFKLALFVT : A I ALVWI AL r ATT t A ICU: IHPLC 

.JrlFLTAtPLYFlSRYtC-'HYARNVYtALDWrDHrjKLODMRCHSPtFCDR 

'•f'r._0';;M MlMIT) n0('.l7'i 

No robust homolog present in Genebank/EMBL as of 11/7/98
f-v.-vKKtr/M; v.r ::jiTi::*.. vv::vu:'ATRDKEi apkkoft tAK [ :.TL.\i LA;;iw\u;ALVAi; 
: .:lt t vf /;NPvn JiTAi.n:v*/rrLWiiyhn'::K7::nNwrjKV[.iiONFKpLi;K/\woFJCN 
7!/:v::^^^>l'.•^■'YH^^lLNl'K(■•KVA tf/rDA:'<>F-FnrTFLr:LRV t kknoitc; r i FNrvnpTNL 
I urrrAi'tif j:t 1 1 .Y:rrt .Kt tK: :vwmi:KoitEonPAK.:F.r*pFr;r'PWRWKLPMEALDOTFNL 
CT39rt hypothetical protein

KOAOCEDLIVSUCESUkSTENSSSVIEKEIFESrKKrNEBCKALI^RTELKHATNPEU 
:7 : YERLLNNKKDRVWPt DIRVCSCCH r^'LTPOHENLVRKKDRLIFCEHCSR ILYWOESQ 
VNAQENSTAKRRRRRAAV 

^: *■ - .* .-'i i »: '* : ^ • ■ 
VFNL I tho K/ZCYUiUti-Jw-KENKMr ;*M I JTDVXTCC ; UJKwKtAVDFrFOAFOPKEAM 
0LAEKIU:HSCWVFr3CVT:K3CCVARKLVATL0St^ERAI^FSPVDIXHCDLCLV^ 
VCU'SKSCETOEIirnVPHLKSRRAILVAITSMPYSKlJULSDLVVtLPSVAEI.DPFNLl 
PTNSrrCOMIFGDFIJOttJLFHSRCVSLSTYCKNHPSCCJVGWCAfCI^ 
HUIDKVSFSLEVFSAYGCCCXIVDPOFRUCirrDCDUlRStASYOCEVLSLSLEIC^ 
ANPRCITEDSDIAIAWLMESSSPVA\rt.PVLDNEENRHVTCLUOlHnJUCAC^ 

CPn_0527 609910 608726

sucB-Dihydrolipoamide Succinyltransferase

RYMIFEFRFPKlGCTSSOCSmWLKNlXIDHVARDEPLIEVSTDKIATELPSPICACRLVR 
FCVNCSOEVASCDVLGLZELEEISEADDESTSCPPTSCCTKSEACSSSSSVWFSPAVLSL 
AORB3ICIJNL0KIAC^^CaCOCR^m^0Dl£AYISESO0VS IPEIFQGEVNRI PMSPUUU^ 
ASSl^SSDEVPHASLVVtnroVTDLMNI^SGER0RrLDTHCVia.TlTSFIVOCI^^ 
FPLIJCSLDCTTIVHKKSWMJVAVNIJWBCVVVPVIHMro^ 

LNKLDPSEVODGSVTVTNFOflGALICMPI IRYPEVXILCICTrOXRWVRODDSLAIRK 
KVWTLTFDHRVLDCIYGSEFLTSLKNRLESVTMG 

CPn_0528 611165 609921

gltT-Glutamate Symport

LMXIJWKKIFICIJVCVTLCI.VIXDKAIFFKPICDIFUJU^^ 

DMKKIiGRIG IKSVCLYIXnTALAIVICLCFAWIFSPGNCCOFAOAOS^QSAVTVIDSNKT 

AAYFLSIIAOVFPSNFVRSFAGCailLOIIIFAirLGIALRLSGERCRFVQtFXOGFSCIM 

LRMVmiKSFAFYGVCASMAWZSGNKGLGVLWOLGKFI I AYYIJO.FHATLVFGGLVRFG 

Cia4SFSKrWS>0flJAISCAVSTASSSATLPVTMRCTraKmjGVSAEVSCFVU»Ifi^^ 

GTAIPOGKAAVFIAOAYNCPLSLSSIXU.WATFSAVCSACVK»airTLCS\aA^ 

PIQGIAIIJU:iDRlJU3IVGTPKNILCnAVVATYVASGBCELSPYESIK0ESVETT 

CPn_0529 612298 611165

ycaH-ATPase 

FSCKEIRAFKRGTMKJCRFPSTUTJnfRRVTIArSLECILCl^^ 

RFSMSTPYRARSTVIS^rtanVVOGMKTPT^rt*^lAEALR^J^GYSOT 

KLT^Am5KVKSA5yVGD£PIXHAEXLPECS^A*A/HKDRAISAARAA£KFGILIXD^^ 

KLHKCmAVVNGOOPLOCRAFFPKCRLRDFPLRLKTVDAI IVNGOGKEACrVVKRVSNA 

POIFVKPTIASVVWTHICERIPKEAUIEUIVCVFCCUJFPOGFU^^ 

POKAMTXm/rrFCQGKAMROGOCLLCTEKDSVTa^PRI^EVSLI^^ 

DTLSLLNHZEQIH»1RGH 

CPn_0530 613323 612460

spoU-rRNA Methylase

5^ArIMGKFLWRRCCSXAFV£FCSHDCIGKK^IPLVK£A£AUCRSRCRX5SWFLV^^ 

KAJUmrnjC0HVFCSTHl^EX£X£FLYEIJCRN5TKILYCU}STIJ^l^fTOKHDSFV^^ 

OKRVWNKEDFLIORKHAOPrYLIIEOVEKPGNVGAIXJllADCACVlXraLCNP^^ 

NVAmSSLCAVFSI^II^I5R£CGKELFK0EC>rr/FVTSPRA£7WFSK^^ 

EKIX^LTEIX^SESFSCZALFMUSESDSUnATSVAAVAYEVVRQRMVN 

CPn_0531 614198 613245

SAM dependent methyl transferase 

DSSIOJDFRXEKCRRKSQYRDRYVNKinCTHSKTyrSLIRERLVMDYKli^^ 

CPNnrLIRPSSIAVWP!CSRPEIJi*SOA0L0YVRECERGAWKNrKRLPEEWEVAFSDVRCIIJC 

RTPFCTLGVFPEHMGFWPALKOAIEKHKERQVUnj'AYTGAGSIFAAKCCARVW 

AAVRWAQRNVEKNAFPERRIFWIEDVrSFUCKEIRRWaCYOVILLDPPSYCRCPDCEVr 

KIDKDLFPr^LCSKUJ«)OASYFU.TSKTPGHTPEFLRAIARRSVPTLVSEAWSCCCSF 

CCECVGALPSCSFVQWLA 

CPn_0532 614716 614075 

ribC/risA-Riboflavin Synthase 

ESFCCKDSVVKWOatrSGIlOELCEVCFFEAOGNCLSLG IKSTPLFVTPLVTCDSVAVDG 
VCLTLTSCMESKIFFOVaPETLACTTLCEKRCSDOVNLEAALKhCOSrGGHLLSGHV^ 
AE IFLIKEWRYyFRCSKELSOYLFEKGFIAIDGISLTLVSVDSDTFSVCLI PETLQRrra 
GKKRESERVNI EZOHSTKXQVCrrVKRI LASSGKD 

CPn_0533 614918 615385

CT406 hypothetical protein 

EVAPKOCPFCrmGEUCVI DSRNAPEANAI KRRRECLKCSORFTTFETVELTLOVUCRDCR 
YENFOESKL I HCLNAASSHTRIGOOOVHA lASNVKSELUIKONREI STKEIGELVMKYLK 
KADMIAYIRFACVYRRFKDVCELMEVLt^ATPOMEK 

CPn_0534 615389 615784 

dksA-DnaK Suppressor

LNFTRSKWPLSDDEIEOFKKRLLEMKAKLSHTLBCNAOE'/XKPNEATCYSOHOADOGTD 
TFDRTI SLEVTTKEYELLROINRAIXKINESSYG tCOVSCEE IPLARLI AI PYATKTVKA 
OEOFEKCLLSCN 

CPn_0535 615763 616296 

lspA-Lipoprotein Signal Peptidase

KRTPIWKLSSMATRFRSTLLVirLFVLIDWVTKLWLLOYKOLO I LTItPTLYTHSWCRFS 
F3 1 APVFNECAAFGLFSNYKTFLFLLR IFVrLCLLAYLFFKKKS I03TTCTALVLLCAGA 
rr.NVCO t r FYCH t VDF tCFNYKOWAFPTFNVAD'/LI SLGTLLLVYK FYFPTKQTEXKR 

OPfuO'/t^ hlbJOO ',17601 

dagA-D-Ala/Gly Permease

YR.';i-Of:f I .RVFFKTVMNRLLSLL3VF0DFFW.rr/AFlLI I'/LGVCF^VKSRFTOFTKFSO 
KriKLFP YY:;0Nr\?ERETKCS:VHPtXVFFA3W7;tItCICrr/A: rvTAA<; iL'or-ryvLFWwr 

N ; r Fn:: r vKV::fV^'i*UKFRKLDftonv/yy;pM'/FLiKAFKTPwm/rvA t i.r/: r yitve i 

rOF^V ITDr:L/Mtt 'WNI.rKVTFMt/IU.FLVr^A IftOCLOR rCK ICZ l'/UPFFMI.t.YrAU;L 
Y I t.yKRFirrLrilt.L:rrVF;;:;AKK/XWALGGFAGCTVATnHOG ISRAAY: A ;ii fi f^lFD:; t 

lo:iF-':^•AKDp:rlvA^.a-;IvotAIDNLt^T[J;LLKV^ASG3W5LCLF^A::ovv^l^^u^ 

I t4VKFFLI TKr>vri TY n* t I::YFl-VrKK':AKFLYCNTCAK I YTLYf LIM >r FUWr 
ALl.lR':v;;' ;,\UJA'rNl.UA'F tU'KC/ tFPAPAA3LTET3LrrrE 
i-A PLMU ;KK I.L. ;^>^L^KRK^m^:."LJED I DEU*DEKKORUWKKNUXX; I KVOL^LVLIWK^
rPHKO 

:rrHl1 rtvp'jrn«''C I'M; procfiin 

n Jr.AOHWr:.':n^K'':r/ T K T KTLRD r'/MFPhWHK PK CTKCKR FRWLRCVLFCX^ 



CPn_0539 618678 621545

pmp_19-polymorphic membrane protein

"^>aiCLRHMK0MRLWCFLF1^3rCQVr/LRANDVLLPLSGIHSGEDr^LrriJ«SS^ 
TTYSUlKOFrVCDFACNSIHKPGAAFLNLKCDLrriNSTPLAALTFKNIHLGARCAGLf'S 

ALFFRDNRCTriLFLKWKAVNQDSSHPCYGGAVSSISPCSPI TFADN QEILrQENEGgJC^ 

ArY^^XX;AITFENNFQTTSFFSNKASFtX:AVYSRYCNLYSOVCT^t.^^K^^ 

ADYVH I RDCKCS IVFEENSATAOCUIAWAVCDIKAOCPVRriNNSALCLNCCAIYMOAT 

'^S r LRLHANOGD lEFCGNKVRSOFHSHINSTSNFTNNAITIOGAPREFSLSANEXmRICF 

YOP I ISATETATISLYINHORIXEACGAVI FSGAIU-SPEHKKOJKNKTSI INOP^^ 

CS I ECX;AILA\mSrrOeCCUJa^PCSKLTTOGKNSEKDKmTNTJGFKU^ 

: RATEKAS I E ISC/PRVYCKTESrYENHEYASKPYTTS I ILSAKKLVTAPSRPEXDIONL 

: ZAESEYWYCYOTSWEFSWSPNITrKEWailASWTPTGErSUSPKRRGSFimLWS^ 

SGLNIASNI'/NWTrUWSEVIPLOHa:\n^PVYjOrMEQNPKOSSJI^ 

I?rSFrr:XLSAALTOU^SSSSCX}NVADKSHAOILIGT7SLNKSW0ALSLRSSrSYTE3)SQ 

»/MKHVFPYKGTSRGSWRWGWSGSVCMSYAYPKGIRYLKWrPFVDLOrna.VONPFVTO 

YDPRYFSSSEKnJLSLPIGlALDlRrrGSRSSLrWVSTSYIKDLRiWNPOSSASLVLNH 

YTVroiCX;V?LGKEAI^ITl2OTIKYKrvrAYMGISST0REGSra-SANAHAGI^LSr 

CPn_0540 621631 626662

pmp_20-polymorphic membrane protein

FIHLIYLSLIEFVNISDaFSSM}CWLPATAVrAAVt.PALTAFCDPASVEISTSKrcSGDPT 
SDAALTCrroSSTETIXrrTYTIVGDITFSTFTWIPVTVVTPDANDSSSNSSKOGSSSSGA 
TSLIRSSNLHSDFDFTKDSVLDLYHLFFPSASNTLNPALLSSSSSCCSSSSSSSSSSGSA 
SAWAADPKCGAAFYSNEANGTLTFrrDSCNPCSLTLONUWrGDCAAIYSKCPLVrrGL 
K^a.TFTO;ES0KSGGAAYTBCALTT0AIVEAVT^rC^^'SACQGCyVIYVKEATIJ^UI^^ 
KFEKin'SCOACGGrYTESTLTISNITKSIEFrSNXASVPAPAPEPTSPAPSSLINSTTID 
TSTLOTRAASATPAVAPVAAVTPTPISTOCTAGNGCAIYAKOCISrSTFKDLTrKSNSAS 
'/OATLTVDSST IGESGGAIFAADS IQIQQ Cm ' n ' L FSGNTANKSGGGIYAVCQVTLEPIA 

^:LK^f^wrrcKGEGGAI•rrKKALTI^ffJcyvrLTTFSGNTSTr»4G^ 

FSKNKTGNYSAP ITKAASNTAPWSSSTrAASPAVPAAAAAPVTNAAKGGALYSTBCLTV 

SGITSIUFE^lNECQNOGGGAYVTK^^OCSDSHRMFTS^lKAADBGGCLYa:DIWTLTOL 

TCICrLF0ENSSEiO^GGCI^IASGKSLT^f^SIXSFrL^^ANTAK£NGGGA 

TPTPKEPAPVTOPVYCEALVTGNTATKSGGGirrKNAAFSNLSSVTFTOOT^ 

TQKAADKrDCS?TYITNVKITNm'ATGrCGOIACXnCAHFT>RrDm.T\^ 

EDALILEKVITGSVS0NTATESGOGIYAKDIQL0Al»PGSFTITWnCVrrSLTTSTm.YGG 

GIYSSCAVTLTTlISCTFCITGNSVINTATSODADIQCGGIYATTSI^INOCNTPrLrSNN 

SAATWCTSTirOIAGGAIFSAAVrr ENNSOPI IFl^WSAKSEATTAATAGNKDSCCGAIA 

ANSVTT.TNKPEITFKGNYAETGGAICC lOLTKCSPPRKVS lADNGSVLFQDNSALNROCA 

lYGETIOrSRTGATFIGNSSKHTCSAICCSTALTlAPNSOLIFENNKVTETTATTKASrN 

^^LGAAIYG^INETSD^TrSLSAE^CSIFFK^WLCTAT^^CYCSIACNVK^ 

rraAVWSTKTrHAQELKLNEKATSTGTIlJ'SGErJiEI^SyiPOKVTFAHGrn.It^^ 

LSW5FT0SFCTTITMGPGSVl.S^mSKEAGGrAIN^WIID^SEIVPTKD^»TVW 

VSR11^ADSKDK:DITGTVTL^)P^CNLY0NSYD3EDRD^^m^IDNSASGAVTA^^^ 

GNLGAKKGYIJGTWNLDPNSSGSKI ILKWTFDKYLRWPYrPRDNHFYIKSIWCAQMSLVTV 

KQG ILGNKLNNARFEDPAFNNFWASA ICSFLRKEVSRNSDSFTYHGRGYTAAVnAKPROE 

FILGAAFSOVFGHAESEYHLDNYKHKGSGHSTOASLYAGMirYFPAIRSRPILFOGVATY 

GYMQHDTITYYPSIEEKNMANWDSIAWIJDLRFSVDUCEPOPHSTARLTnrrEAEm 

OEKFTELDYDPRSFSACSYGNLAIPTGrSVDCALAMREI ILYKKVSAAYLPVILRNNPKA 

TYEVL5TK£KGNVVNVLPTRNAARAEVSS0IYLGSYWrL.YGTYT I DASMNTLVOMANGGI 

RFVF 

CPn_0541 627137 628003 

Solute binding protein ( -yebL-Synechocystis Adhesin Homolog)
NNRSSYOTAFVMHKVIVFI FLTLYSLKSYGNDVIDKPHVLVS lAPYKFLVEO lAEETCFV 
YA rvrrmYDPHTYELPPOQIKELROGDLWFRIGEAFEKTCERNLTCOOVDLSONVSLIOG 
KPCCN0HTT^mDTHTl^PKNLKVOVETIVTTLSKKYPQHATLYOS^CEKI^^ 
Er:.TITSKAKORHILVSHGAFCYFCRDWFSOHTIEKSSHVEPSPKDfVARVFRDIEOYKI 
33VI LLr/SCRRSSAMLJ\DRFHMHTVNLDPYAENVLVNLKTIATTFSSL 

CPn_0542 628000 629737 

ABC Transporter ATPase 

FHTIRILAEGLAFRYGSKCPNI IHDVSFSVYOCDFICI ICPNCGGKSTLTMLILCIXTPT 
FCSUCTFPSHSAGKOTHSM ICWVPOHFSYDPCFPI SVKDWLSCRLSOLSWHCKYKKKDF 
EAVDHALDLVGLJDHHHHCFAHLSCCO lORVLLARALASYPE IL I LDEPTTNI DPDNOQR 
t L3 1 [JCKLNRTCTI L^mHDU^HTTNYFNKVFYKNKTLTSLAETSTLTDOFCCHPYKW 
FCCSPH 

CPn_0543 62R710 0:9603

(Metal Transport Protein)

K:;C t FW L^'GLI RD5:FPLLI LLPTFLAALCA3VACCVMGTYIVVKR I VS I SGS rSHAILGC 
rCLTU^IQYKLHLGFFPMYCAIVCAIFLALCIGKIHUCYOEREDSLIAMrWSVCMAIGII 
F t SRLPTRICEL INFLFCN ILUAn-pSDLYSLC r FOtXVLCI^A/LCHTRFLALCFDERYTA 
LNHCrr/OLW^TLUVLTA rXTVML I YVMOT r LHLSMLVLPVA I ACRFSYIWrR IMF r SVL 
LNtLC.riFnC rc r AYCLDFPVnPTISLLMGOGYTASLCVKKRYNPSTPSPVSPEIMTNV 

'•Iti_nS44 o»»S*»S i09525 

yhf/Z-Trr* tiiniiittu pruretn 

K. :::VP :^ ; I K:;:;FFt:LKKDKNV IMF-yiV ITLELRACKCXJNCJWAWRKEKYLPKGGPYCCN 
. U :n< 7 :;:•/ r l FATT:nA-;:FKAVRN rP FLKArtyiOiX^ATIINnTnRnGKDL rVSVFTCTLLRO 

Ai-h r t J lonvmEiu.Lv: xjf >^Kf /;K(.:ntffkt.';vhraptkatp^;k pce t rqveleucl 

I Al I U :Fr K^V :K:rrt.l'HTI AMTEVKW^AYPFTTr^XrrjLCLVlX.-KORLVOKPWr iadip 
:: U-JIAHmNKCO :i.nn.«ll n-.hTL.LLU'V [DVSKnEr'iJSPEEDLLTLtUEUfrJHOPDFEK 

r I 'Ml .VAi lit: 1 01 Ji .1.1 -1 U'; ;r-OKKF p;:r cr-VL r :r;i;rt ;t:( ;vu :l.yrfftorlav 

II oM' I :ki'7mai tK K' :a: 'Mt:nn:;y: : kku:7KVi w '/jy.*/: rr^ ;: : 1 1 .vl^(;Itf rrpv/tiPAONVG 
I" 'Mi A/vi ,i-Ai .V) I ; I vv'MKK'i-HirrY r:n/\*rtXfL 



rl31-L31 Ribosomal Protein

LJK0RLTL5:ERF,lflKK£>»ePYAVI0Tp3K0Y0VRcKSV;DV-E-ai^ 
FVFIXrrKASU:3errrANA0VXAEYLSavamW^YKYKKBKN^ 
EILI 

CPn.0S4':' -;3l5n'> 632199 

yabB Family

tlALKJ LK FNOK ZJKVAZTliiljJKPKF LCKUmVLRON i As^ VMN LT i'TZlZ ITAT^SGw 

FcccDGvocFC/:.r^r/co 

CPn_0548 633234 632191

cysJ-Sulfite Reductase 

KKYLOEKFKACWVPLVlJlEU^CSDSINOSDPIYRKVrDSNDCrirnCVaJ^^ 
KEVSEHVWr^GYSPTTLVNVKKTSEKVSAQKFrOCYVDLDK : PAKUJSFFPDKOPItlTL 
YDAIOEYRPOI P r ELFAESVTPLLPRFYS lASSPDLHPKSIEliVXHVSYPCKYOKRFCV 
CSSFl^SELaV?roSAYIFV0PTKHFTLSTC3TECKPLVMrGAGTCIAPYKAFl£ERLfNKD 
PGNMU^FGERKEKVNnnfRErVfNHAEEaaCUCLFlJtfSREROOKV^^ 
KAYEECCFFrvCCRKVU:iEVKHALEEILCKDTLASUlK£HRYVVDVy 

CPn_0549 633662 633255 

rs10-S10 Ribosomal Protein

P0DVQHOPW^K3HSUJlFtJCKFKKRIiRSKOCMKQ0KCK:RIRIJCGFIXX:QLD^ 

TAKRTGARVVCPIPLPTKREVnVlilSPHVOKKSREOFEIRTHKRLVDrLDPTCICriDAL 

KKLALPAGVDIKIKAA 

CPn_0550 635688 633580

fusA-Elongation Factor G

LrTCEKNKrMSWEFDI^AIRNICrKAHIOAGKTTTTERILFYACRTHKrCEVH^ 

DWMAOEOERCrTITSAATTVFWUGAKINI IDTPCHVDFTIEVERSLRVU3CAVAVFDAVS 

CVEPOSETVWROADKYGN^RIAFVNKKDRKMYFAAVESMKEKLCAhUrPV^ 

OFVCKVOLISOKALYTLODTLGAKWEEiaXSEDLKCRCAElJlANUXEIAT^ 

KKVl£DPD5ITEOEIHOVMRXG^^£NKINPVLCCTAFl<NXCV0QLI^ 

NIRCINIJCTDQEISI^PRRDCPIAAIAFKIKrDPYVGRITFIRiySCnjaCCSAILNSTC 

DKKERISRLLElfiWNERTDROEFTrcDIGACVCLKFSVTCtTrU^^ 

VIP<AI£PKSKGDR£KIAQAI^SLSEEDPTFR VS lW a ;iWl T ISCMgELHLOILRDRMI 

REFWEANVCKPOVSYKETFIVSGNSETlCYVKOSOGRGOYAHVa^lSPNEPCKCN^ 

KrVOCVIPKEYlPAVIKCIEEGLmWACYGLVtJVKVSIVFCSYHM 

MAVKXWTRKAKPVILEPIMKVAVlTPEDHLGEVtGDLNRRRCKILGOESSRCKAaVNAEV 

PLSEKPCYTTSLRSLTSGRATSTMEPAFFAKVPOKIOEEIVKK 

CPn_0551 636174 635698 

rs7-S7 Ribosomal Protein

KYMSRRHSAEKRDIFGDPIYCSVZLEKFINlCVKMHGKXSVARXmSALERFCKXL^^ 

VI^FX;EALE>lAKPILCVRSRRVOGATYCVPVEVASeRR^CUHQWI 

VSLATELIOCFNKOGATZKKREDTHRKAEANKAFAHYKM 

CPn_0552 636698 636219

rs12-S12 Ribosomal Protein

I0ASY\^5SS£NKPLFTXRALLYZSHLVVVRLKR£EYHPTIN0LIRKRRKS5LARXXSPA 
UJKCPOKRCVCWVKTKTPKKPNSALRKVAWVRLSNCOEVIAY lOCEGHNLOEHSIVUO 
OGRVKDLPGVRYHXVRGTLDCAAVKNRKOSRSRYGAKRPK 

CPn_0553 637753 636812

No robust homolog present in Genebank/EMBL as of 11/7/98

GCMWRVVUlFl.IIFrLGRAVFPLRASESFSWETSTCLTVTX;iPFIDi:LTrNEDrVAQCC 

LQICTISSmiAKIKEIFLIYKEKFPEASISFXRKEPLNI^QSHI^OLCZLCKRNSenrA 

BCMANKEWGPALKOPKDUlLVLRCP^«^PryrtXYSEKEAEKGI^r^^'CLCN0GYT^^ 

ILYCDS IEKFIJCE^^aU(N^tHTLVDLCDSQVVTITDGRFWSLLNYVOVl*F^^ 

I PDIAOATOIX£}nVPLLF IYT^^>SIHI leOCKESSFTYNOOLTEPIUSFLFlGYI^^ 

EYCFNCA0S5LCET 

CPn_0554 637806 638141 

CT440 hypothetical protein

VFSYUXCI ILVYVRFMYECKSRMASPTPGQLHLQOKVESKAYDYSRSLAMIATALLFFI 
VALILSGLSU.POVFLPPSGAYFIIGSFLAFIALCILLINCVCDUCOYLTSS 

CPn_0555 638298 640241 

tsp-Tail-Specific Protease

MFVHKKL\mLCVVLI^U-PrAfT*FSSDUJlEBCIKKKKDKLIEYHVDAOE\OT)ILSRSW 
SYIOSFDPHKS^'l^NOEVAVFLOSPCTKKRLLKNYKACNFAIYRNINOLIHESZLRAAOW 
RNE>A^PK£LVL£ASSYO tSKOPMOWSKSLOEVKORORAUXSYLSLHLACASSSRYEC 
KEBOtAAIXUlOrEhmENVYLCirroHGVAMDROEEAYOFHIRVVKALAHSlXlAHTAYFSK 
OEALAMRIOLEKCHCG ZGVAOJCCOIDGVWREI IPGGPAAXSCDLOLCDI lYRVCCKDIE 
HLSFRCVIJXr-RGCHCSTWLD IHRCESDHTI ALRREKI LLEORRVOVSYEPYCaxnriGK 
VTUiSFYECENCVSSEODLRRAIOGUCEKNIXCLVLOIREOTXXlFLSCAIKVSCLFKniG 
*/WVSRYADCTMKCYRTVS PKK FYDGPLA ILVSK5SASAAEI VAQTLODYCVALVVGOEO 
TYCKGT rOHCT rTCDASODDCFKVTVCKYYSPSGKSTOLOCVKSD t LIPSLYAEDRU3ER 
FLEHPLPADCCDNVLHDPLTDLDTQTRPWF0KYYLPNU3KQETLWREMLP0LTKNSBQRL 
SENSNFOAFLoO IKSSEKTOLSYCSNDLOLEES INI LKDMILLQQCRK 

CPn_0556 640921 640325

crpA-15kDa Cysteine-Rich Protein

ENCMSSN[J<r\*OCTCTt;AAAPESVLNIVEEIAASC2VTACL0AIT5:irGKVNLLICWAICT 
KF tOP t PE3KLF0CRACC r TLLVLGILLWACLACMF IFHSOLCANAFWLI I PAAIGLIK 
tXVrrSU:rOEAi:TnEKLMVFOKWACVLEa?LODQIt^^NKIFCirVKTBrWTnRATTPVL 
NDGRCTP'/L;'rLV:;K lARV 

rPri.OS*".! M-KIMV) Ml I'M 

E I PuiKL I RKwry u\t ;r: ;M/VXFA:y r haavaeslitk ivA.';ACTKi'Ar'vitf rAKKVR 

I.VHRNKOI'VEOK. ;i<i :AI •VPKKFYI'CErf iW :vi'VF-AOOEr:CYCRLY.';VKVNMX ItVE irO'Jl 

vpf^ATVA ;: :rv r 1 1: 1 1 ^\ u :kk 07/r>w nf/jur*:KAEP/'jr.Dfm^rr'M* ;klvwk e dh t. 
t M iDKCK ir.WKi'i .K M u '( *hTAATVf rACPEumr/TW r*OPA [c r Kora :rT< :Ai:iMi:(^n • 
VK ( ewTfTT^iA I ARfA-rvrtNPvrr/ :Y.':nAri xjHVt j:r-"NLGOMPr':PKKVFTVKK:por'Rt; 
V I TNVAvmw a '.\\Ki:.:.\NvrrrAtF.\''V*j\ni irj wvowr'.YVCKPVEY:: r.'r.r::Nit luij/ut 
bw I oi/rLi': * :vTvi t n :i: I ' .-r .riK'T-ArfM ( K r>«;F<;n:rrjOFKLWK A(^»vi ♦ iiin'NWAV 
•rr;E:JNfrrr'.-r:VAii''rrn tww :i juvntMr.vi jTrriDP icviEnrv^n ti'VTNRf *^:Aia/rfn/:: 
I.I ijck::keu^f I A;::-:r-rK. rr r;:':rfrr/KrjAi.('Ku;:;KEi-*VEF:;vTi,Ki; i Ah awKiPAi 
u ; crrr .t; : : v::nTQmrrf

omcA-9kDa Cysteine-Rich Lipoprotein

KLJ1KKAVL[A/\MFCCV-/-LJ-CCRIva:CFE:DrcAPSSCNPCEVtRKK£P^CGCNACCSY 
7PnCSNPr:C,-TEr N.-^Q" POVX-DCTC POCRCKO 

,j:;r,-: :.K:;K:ii: :L::r ;.: r :;:.r.L;-.v.: = r:-»Tu.;v:L-\v:f'r: :;/mkau.;l:: 

FFLAR5VFT1TCYNTML 

CPn_0560 645666 644098

gltX-Glutamyl-tRNA Synthetase

RNSRFOGMKSLWSKDKRIMNWENVmWVAPSPTGDPHVCTAYMALnJEIFAKRFKCKMIt. 
R I EDTORTRSRODYEEWIFSALfiWCG lOWDBCPDVCCPYGPYROSERTKIYOCYVETLLK 
TDCAYKCFATPOELAEMRAVASTUIYROCYDRRYRYLSPEEVASREAACQPYTIRLKVPL 
SG ECVTEDYSKGRVVFPWADVODOVLVKSDGFPTYHFAIWIODHLMGtTKVlJlCEEWLSS 
TPKHlXLYEAFGMEPPVrLHMPLLl^PDCrm.SKRKNPTSIFYYR0SGnfVKEArVNF^^ 
MCYSMECDEEVYSLER I lETT FNPRRICKSGAVFOIOKLDWMNKHYUWECSPECXIJCn^ 
CWIXNDEFFLK r LPLCOSR ITrtJ^INLTSFFrsCLLEYRVEEU.POALSPEKAAILLY 
SYVKYLEKTTOl/rKETCiXCSKWLAOAFNVHHKKAI I PULYVAITGKKOCLPLFDSIEIL 
GKPRARARLVYAEKUjCCVPKKLAATVDKFMOREDFEEATFDL 

CPn_0561 646407 645871

euo-CHLPS Euo Protein

LMACEOHECCYEtEEREEI EDIKDSOTKWVSITOAAKLHNVTRQAIYVAIKOKKLKASKE 

trweidikdueeyxrnrysrkkslyogelvfengkccys inqvaoi LG I pvokvyyatrt 

CTIRCERKGAAWVI HVSEI ERYKNEYLSKOAAKKLKGAEPKEHOAPNFEPPTEIFPESN 

CPn_0562 648051 646918

•CHLPS 43 kDa protein homolog_1

NYK\^KSIAIAREOYAAILr»iHPKPSIAMFSSEOARTSWEKROAHPYLYRLLEIIV«WK 

FUJGLIFFIPLGLFWVLOKICONFIIXGAGGWIFRPICRDSNIiROAYAARLFSASrOOH 

VSSVRRVCLOYDEVF ItXH-flJaPNAKPDRWMLISNGNSDCLEYRTVW 

ES0SNILI^^^^PGVMKS0CNITRN^AA«SY0ACVRYLRDEPAGPOAR0XVAYGYSLGASV 

OAEALSKEIADGSDSVRWr\A/KDRGARSTGAVAKOFICSLG\A*tJ^T^^ 

LHCPELFiyCKDSOGNLIGDCLFXKETCFAAPFLDPKNLEECSCKKIPVACnrGU^^ 

SDOVIKEVACHIQRHFDN 

CPn_0563 650113 648293 

recJ-ssDNA Exonuclease

OYKNLLWOFSPKGPCGIKFKTNSDNASAAGLLWAHPKEDPAFLGMI IKEFHLPPTVAQIF 

ISRGFOTI0EIHKrLYSHI^SLYDPCLFU>lSKAVERUJJUU3RKEHVMiyC»SDVIXafr 

GVALLVEFLRDIDVHVSYFFlJCAIIJCOHGETSTLIAKLKEECrrU.ITVDCCITA^ 

OITIWlDVirrDHHMPTGKIPHCVATU/PKIJU}HTYPNRELTGVCVAFlCl^^ 

SRNL\^KS0GSUCKUJ3LVTLCTITDVCVIXGENHVMVRYGtKEX 

GVEKSEVTSTDIVUCIAPKtNSl^RLDDPAXGVELIiTODDERVnALIMELDNr^ 

IEAEVFCOVOEIUJSNPEILK0AAIVI^STAWHARVIPIISAIUJkKTnmPV\a 

IGKCSARTIGSFPLLGVIJCKCSSIXLSYCCHOFAACVaMICEDKN^FKKKF^^ 

KCIn^PHL£IDAYADFDAIDYDLIASMELFEP^GKC^n/^PIFYSKVROVRrPiCVLPGNHL 

KLYLSQK£RNL£CVAFCLGRKA0AIJCASVmYPL£IAYTPRI^QTSG5GVIHIX.Vimra 

SEPRFSD 

CPn_0564 654359 650145

secD&secF-Protein Export Proteins SecD/SecF (fusion)

SGAMKQKVKRNFAII rCVFALALYYVLPTCLYYAKPLDKKIDGNEAEH I IKSFTKOAQGV 

RKDVIPRVSAILSSLHLRGKIOOHPAIPDXVSVRFKRCEDAEDFIGNLVHGEPNVPIKSA 

RLH\AWfSREHDDHVIOVASSrNTSLVESDFSrVSYSSDJEOEMASSILORVYSACrTPK 

OKEDCSCSYPS IWETAPKEOLLOYAKNLSSGFEVFSSRLSAFCOOSFSSNODRLAFLSRLS 

SLSmiAAI DVEDQKIXKSVYETi^OTACIRSLJr PYI ECLRUXrSESSLFFSSIE^ 

RKIFLTWSDUAJRTSt^KEORIJSFDSRIJlVTXOKl^Krn^TVQVraJVNrC 

TOGKI r LOGERIXOG lAEHLTALTLHRPAAESCDLI PENFPVFCROPRESEAPGCYIFSP 

rn»DCKKFSKCS\nfIIXKGLRSrVAKYOCXXWKELOSFEKDLONLYNCrSHTEArSWTLGE 

OOVIXI RHPUJOFlXVWCECrVIGKEGCAFLEVKDIODRIATVWI EKNROSDLVRWH 

YRHAKCSMDL0ERL3AP I PYONLFLENMKIiWRKFSRCENILRLG I DFVCGROLLLSFKD 

H(X}KOLTDKEDI UCVSDEUJVJU-NKLCVSEIEIJIRECDYIHLSVPGSST ISSSEILCT 

MSFHWNERFSSYSASRYEVORFLDYLWTTSOAOGFCTSPEEINTFASALFNEEVDVPPSV 

HEAITiOJCSEGIAFSPSGCETPSTOLITITFSMIAlGKDAEOKANPLVIVFRNYALIXy^L 

KDIRPEFAAGEGYVI^rSVmrSPKKMAEKLSPTESFHTWTSAYCOECISGTANGOYSAN 

RGWRMAWTIX;YMVSS P I LNVPUaWASVSCKFTHREVSKIJ^DUCSGAMSFVPEV^ 

T ISS0LCKK(XrrOGI I SACCGLAMLIVIilSVYYRFGGVI ASGAVLLWUXI WAAW 

PLTLSCLAC rVLAMCMAVDANVLVFERIREEFLLSOSUCKSVEKCYTKAFCAIFDSNLTT 

•/r-ASALLFFLDTCPIKCFALTLILG I FSSMFTALFMTKFFFMLWMNKTQHTOLHMMNKFV 

G I KHDFLRGCKKLWAVSCSVTLLCCVAUSFGAWNSVLGMDFKGGYAFTFNPKEHGISDVA 

OHRGKWHKLOEAGLSSRDFR ICTFGSSEKI K IYFSDKALSYTKAC3TSLS PKINDHEXAL 

AVCLLSETCLDFSTErrLNETONFWSKVSSKLSKKMRYOATIGLLCALAIILLYVSLRFEW 

OYAFSAVCALIHOLLATCAVLFrAHFFLKKIOIDLOAIGALKrVLCYSLNNTLtlFDRIR 

EDROANLFTPMHVLVNDALOKTFSRTVmATTLS^/LLMLCF ICGSSVFNFAF IMP IGI L 

t/TTLSSLYIAPPLLLFMVRKENRSK 

CPn_0565 655741 654533

CT449 hypothetical protein

f «LFCFLI FCFVN I SAI tJ-DSSFLLK IKRNSKRMLP.SMKFPR r SISDLI PTOMVrwWRGG 
GNVH WPNAONLPKK I LCCVLACFCLALLCCAAFAAGVCOT I FPC ICLM I UJLVLLCFAY 
t^YSKGWSRFERPLFRETKVFEKP tNWLCCLStXOSWKK IRPCCYYHPCCPOVEICECSO 
EIVTK rFOKKCDRNTSIFL rOEMDOIALROGIEKSSLSRKTFAIDPSWSSLLSEIOREE 
OOY LOPKV I GWriSEDOAJORTHPKGA I YVNISDAAOEPOCRCYIDAYTKAFFTVLDOICD 
1 -N rVKKI rr t YVLTP t LCVPDALPKEEOENLKLLSOAAFLYSAEOVAKRMREEKODS IRIK 

FtrTDrT:;fTr:LYF:;n«i::rTrHsvTpi5LSCFvcErjE3*rrFA 

•;Iii_(iSm. hSi, ()•)•» ti^f.HOi} 

y.n*;: t^imilv 

r vrrAu t Mhr-7Du:i atnnae.':kfp: :lorli'nhva r iMDCNnRwYKKiiREECCHTHTr; 

IHYY' JAKVt.rnn JJAVMUr. IKVLTLYTF.-rrENFTjr'.pKEEIOE IFNI FYTOLDKOLPYLM 
lilKUXIf • I i ;i/t^:Kf.rKc :KrrK INHV::RMTA;;F:^P.LELVtAVNYOr;KDELVrtAFKKU4V0 

n j'iKK I ::;:fJiJt.';f%::i.t ::::Yi.ar:*-'';LTnP0LLirrTTy;o4Rv:;MFLLW0 tAYTELV ITDTLW 
iiti'Vi VI ') .i"KA I rfVY\.vi*:nu« > :k 

''ln_»c.<,/ »,',«M»/t t.S7Hl7 



VLNSNKFKGKr^AY'^OLFOP*- .SL\''-TFLVLLLYJJLrrLT:;KAUX:TAr:GAVCTY 
EYSSMAKAKMH'rPLJTFGA r^GFLFLALJr:^ I BUnH.'TLPrJFFDAI.PWrLL :* AiAA/WS IF 

RVRKST icalolxvtlt:; niYvci n t BaxuivLynir craePYLruiWWiPff' lOTfXGA 

0 r FCYFFCKAFGNKK I A'PO 1 3PWKTVVCFVACCUUTL IZFl FFIC: FTRFASt PtWA I 
L I PLGUI/: IT^FFCD 1 1 EG : FKRDAHLKNSNKUCAVa:.MLOTU)r;LJ:-:^P lAYLF^ 
TOSKEFIC 

rPn_05'>« •;57f?0S SSR4t>4 

EEPPFSFTFATOPL£5Frrx;HU.rS£:.7Tt:EVANAAatUiCLPEVRAFMUDL0RlllfAOI. 
CrO^ECRX^CSKVFPNADLK : FLT33P EVRAORRIJCCLPECTLSPEOUJAELVTCRZJfc^ 
AORAHOPLVI PEHG rv I0S3CLT : ROVLEX I LALLFRNEL 

CPn_0569 658398 659099 

plsC-Glycerol-3-P Acyltransferase

tJx:rwKTSscENFsrrrsKRAMiFRrcKFrrwAFsuTKucvYcvKXNrr)C^ 

NHNSFUDPIAUiMCVWEC:'rHLARASt*FNIPWl>«OVJCXFPVROOEC^ 
KRKKLVIYPEXURSPIXX3WPCK^raIG^WAAKSRVPIIPVYIRGTFEA^NRH0KIPHVWK 
TITCVPGTPMYFDDI lONPEIKUKETYQI ITNOTONKIAELKAWYESCCKCDVP 

CPn_0570 659044 660789

argS-Arginyl tRNA Transferase

TKLPSSKHGWmGAKETSPKLMSTU^II^lCSOAIAKAFPNLErWAPEITPSmKrO 

HYOCNDAMIa-^RVLIaCAPRAIA£AIVAELPOEPFSLIEIACACFI^^^'FSPV^IlJW 

FKDAUCIjCFOVSQPKKI IIDFSSPNIAJCDMHVGHLRSTI ICDSlJ«IFSYVCHOVL^^ 

ICOWGTArCMLITYLOENPCDYSDLEDLTSLYKKAYVCFTtroEEFKKRSOONW 

POAIAIWEKICETSEKAFOKIYDILDIVVEKRGESFYNPFLPEirEDLEKICCLLTVSNnA 

KCVFHEAFSIPFMVOKSOKYNYArrDtJUWYRIEEDHADKIIIVTD^ 

TAIAACVTXJPCIFSHVCFGLVLDPOGKKUCTRSGENVKLRELLDrArEKA^ 

LTDEAIQERAPVICrNAIKYSDLSSHRTSDYVFSFEKMLRFEGNrAMrLLYAYVRIOGIK 

RRZiCrS0LSI£CPPE10EPAEELlJU.TIXRrPEALESTIKELCPHFtTO«.YNLTHKrNC 

FFMSHIQDSPYAKSRIJirAIAEOVIATGMHUiCUCrLERL 

CPn_0571 662179 660749

murA-UDP-N-Acetylglucosamine Transferase

TFEKVNVSFSDFnAKGERRKJIAQVFGCGRUWEVKVSCAKNAATKIXVASt^^ 

RN\^ICDVSLTVELCKSLGAHVSWDKErEVLEmPEIOCTRWPTFSN^ 

AIXGRCPEXWmVCGDAIGERTUrFHFEXaJCOtjCVOlSSDSSGYYAKAPRGLKCNYlH 

LPYPSVGATD*LILAAIHAiCGRTVTKNVAIXAEILDLVLFWKACADITT»ro^ 

TCGU;SVl»rriU'DKIEAASFCMAAWSGGRVnTOaAK0EIXIPrUCHLRSIO0C^^ 

SGIEFTOERPLVaGVVLETIAaiPGFI.T»^PFAVU^AQGSS\aHCTVHOmU^^ 

LOKMGAECOLFHOCLSTKACRYAIGNFPHSAVIHGATPLWASKLVIPDlJWCrAYVMfcAL 

IAE0CX:5IIENTHUJ}RCYTNWGKU^U3AKIQIF0KE0E£LTT5PK5XJUi^ 

CPn_0572 662349 664616 

CT456 hypothetical protein

IMAAPINOPSTnyiiaiU/l i i'l I'i VGSLGEHSVTrTCSCAAAOTSQTVTt.rADHDCE 
lASOOCSAVSFSAEHSFSTLPPETGSVGATAQSAOSAGLFSLSCRTORROSEISSSSOCS 
SISRTSSNASSGCTSRA£S5PDLCDIJ}5LSGS£RA£Z3AEGP£CP0CLP£STIPtfyDPnnC 
ASIUiTIJWAVQQKhCITOCHFVYVDEARSSFinnWGI^^AES I 
PArtt.EMCIAKFCV3YETlHSlW^CKVKPTMEERSGATC^^m^ 

KESSSSyTPSAWRRGAXVETGPIVrDOVOGLKGINWKTTPAPDFSFINEITaCGAKST^ 
CP(nTVGATV\a>NVWNlgGrKVDLGGrNLCg 

rrSTGSOSTIEEZTTIOFDDPGCWEDDNAIPGTNTPPPPGPPPNLSSSRLLTISNAStNOV 

LONVROHLrn'AYDStKaJSVSDUCDUMVVia*SEJJG\wrPTVILPKTT^^ 

VTBODGHIRNIIORfmjSTGOSECATPTPOPTIAKIVrSLRKANVSSSSVI^ 

TPOARTASTSTrSIGTCTESTSTTSTCTCTCSVSTQSTCVCTPTTrniSTGTSATTTTSS 

ASTC3TPOAPLPSCTRHVATISLVPNAAGRSIVLOQGGRSOSFPIPPSCTGTONMCa«3IJH^ 

AASOVASnjCOWNQAATACSOPSSRRSSPTSPRRK 

CPn_0573 665413 664691 

yebC family 

\ra«HSKWANTKKRKERADHKKCKIFSRI im.ISAVKIXXyU5PKSMARI^^ 
ENNIPNENIERNUCKATSAEOKNFEEVTYELYGHOGVCI IVEAKTDNKNRTASDKRIAIN 
KROGSLVEPCSVLYNFARKGACTVAKSS IDEEVI FSYAIEACAEDLDTEDEENFLVICAP 
SELASVKEKLISOGATCSEDRLIYLPLRLVrcDEKIGEANlJaiCWLEOIEtnTODVWm 
S 

CPn_0574 665978 665394

No robust homolog present in Genebank/EMBL as of 11/7/98
SAERGFRHPIVMVETVLHNFORYLSKYLYRVFRFPCRKKTFLSSHRVlARPSFPVOyCPC 
K lYDLQEIYEEUlAOLFOGALRLO rCWFGRKATRKCKSWLCLFHENBOLIRIHRSLDRQ 
EI PRFFMEYLVYHEMVHSWPREYSLSCRS IFHGKKFKEYEQRFPLYORAVAWEKANAYL 
LRGyKXRVOGG\'CRA 

CPn_0575 666524 665982

YhhY-Amino Group Acetyl Transferase

S I FCRVWRSFOTAEKO^m; I UILE I RYTLPSDATYMUCWLNDPKI LRCFP lOTEAEIRET 
VKFVWCFYRYHSSLTAVN-NCNVACVATLVLNPYVKVSHHALIS I IVCEEFRNKG tCTALL 
fWLIHLAKTRFKLEVLYLEVYECNPAU^UYORFCFVEVCRONRFYKDEICYUWCrTMEKO 
L 

CPn_0576 (;6754 3 fif^^OA 

prfB-Peptide Chain Release Factor 2 (natural UGA frame-shift)
MOENLOKRLEALRTEISLAARSL 

CPn_057ii . I pp7S'*^ 

prfB-(natural UGA frame-shift)

MOENI J>KRLEAl.RTE t SLAAR:: L 

CPnJIS77 U-^YVSrf* i*..;Hif.. i.r.HlS'i 
lWIO iYM74) i;umpU>x pror^'in 

Er:it4.':0KMKN;:AFMlirVN t:rrDiy*V I 'AJKf ;mPnTE r VrKVW^^' IKKIIfK.VPOKNKRNI L 
HGANI^Vtt ;:T::pj' tnMfVfrrKAL;;KH I VK 

ClTiJ>S7fl ^fe^JS^Ti'i.H -.^r* 
y.jir) -phO!:t>h^itiytltOl.ii:<: 

mn:y r vl t i ::LATLr i L/\K: :v/a:;i.' j Kf -kwi .ptpa i r-wR t .pkki (ai i u h uj< t Ay i :;oiJ i 

FtlKRVPEKFLNKV::KrTIKNF:;r-rvI.rVF'y:i^|.I/:i/ARLFXpKERLETFLrrri£AriiVFA[I. 

'»«iuy:i::y triRNTKCK m* [ t't:f:K::H i* cjI'A i i AVMf/;LF:;nr*::vKYDrtii.Ti vt^nipiu , 
LKLrJtNTrUTLLHNTTHV t POTLNt VCCjCDLFABu . -f^EOXTKNYOPSL^LLLSHNPOC
tTRUXjrrCDFVLSGHSHCPOVTLSWPKFARKFFERt^UWPYUUlCr^r/rKECKOLYV 
NRCmCUR I RFCSPPEICY ITCSYD 

CPn_0579 hSI^lO «69993

ygbP/yacM-Sugar Nucleotide Phosphorylase

KEF*\3APLXK-ATnHVPH t KS3LI U.GCCKXn'RFCSKt PKOYLPLNCTTF L'/UiSUCIL^ 
:.P0 rAFrvrrrDP.'^YOn'FOEYPVSFAt PCERROOSVFSGUXJV.T^PW/ : IHDCARPFIY 

■ :.:.:::Ai;K; :^■•\u\;l r.■•rrJy':^^rs\■^r:.:^vl^t:'^lv.:■:•: . :^2:LP=r:LA 

CPn_0580 669936 670793 

truA-Pseudouridylate Synthase I

ASSNONFLPRRSNDCPSPPMrKVArXIAYOCTAYSGVTOOPrTOLSIOEVIESSUacrTKT 
RTPLIASCRTa^CVmAYCOVAMFRAPDHPLFAKANLTXKAI^ILPKDXVIROVAl^ 
FHARYLAIAXEYRYSLSRLAKPLPWORHFCYTPRHPFSTEXMOEGANLLICTHDFASFAN 
KCRDYWSTVRTIYTLOIVDKCOSLSI ICRGNCFLYKMVRNLVGAIiOVCKCAYPPEHU^ 
X LEOKNRREGPSAAPAYGLSLKHVCYSSPYNNFCCEOCSVSTSNEC 

CPn_0581 671533 670745

Phosphoglycolate Phosphatase

ECUlWRSVKSFLROCWIYSKLVSDEFOLCUlSGfflrLa>YDVFFFDtJXXLVin^P^ 
Fl^ACAEFSLEVWTOFSTYYSHTTLCrrEIFSKKFIEOYPOAQEWAEIFAKRIOrYYXS^ 
E^lAGPAU«PG\rt:Af r ELVLSLNKTFCWmSPRDATHTUlTW 

PKPYCDSYDYAYRTFARBO0CVIGFEl)S\nCGUlAl^KIPATLVCINSMA£rTPEDYPELK 
GKEFFSYPSrUVLTEHCSQQKLL 

CPn_0582 671305 672177 

CT465 hypothetical protein

KNPhUIXKKIOHRLVKMHDKNKVLYWArnO/IQKRKRHNPLNTYHSSK^^ 
SNrVU<MrLRlSTVSIiTSCSFSKNSRTCrVTPERITSOKrcPVUJiPKSTTISPPLYCW 
ISPNREVITAYSFYCRCOGKSI ITPECVLYDCDGLHHSITKEEFHYIHPRLIEWRLLQQ 
DHPKVS IIEAFCCPKHFHFLEASC ISLSOLKLOTAATFALOPPLPMEXUATIKKLVKK 
MSDPSLSNFIVTEATLTNPEUlLTQODljCSHTEITVEiriJNLONKEALSSA 

CPn_0583 672349 672717 

CT466 hypothetical protein

IVI^FFlJ3KTK\n'PRFL«NERTm.IXKKKKCLFlAILDLT^^ 

IFLSCIDRVDWIKEFRHAFSSELPODIQEELEEIRinaiRILDTDKRNYAQKKKEFCIY 

ERP 

CPn_0584 672659 673798 

atoS/ntrS-2-component Sensor

IRI^UTMHRKKRNLVFMNVPDSKNU^PPAYELLEIKARrTOSYXEASMLTAIPDGII^ 

SEroHFLIC:NSOAR£tLSIDD^l£IL^mS^^DVU»OTCLGFSIOEAI£SIJCW 

CKESKEXrraJ-IRKNEISCYLTIOIRDRSDYKOLENAIERYKNrAELCaOfrATIJU^ 

NPLSCrVGFASIUOCErSSPRHORMLSSI ISCrniSLt©ILVSSMI£YTI«QPLNUa 

DFFSSLIPU^SFP^ratFVREGA0PLFRSIDPDRMNSVVWNLVXNAVETO/SPITLTl^ 

TSCDISVTNPCn'IPSEIMDKIJTPFFTTKRBCaJGLClAEAOKIIRIJiOCDIOL^ 

SFFI IIPELLAALPKERAAS 

CPn_0585 675880 673865

•similarity to Cps IncA.2 

IsuuucIIJU»^w^sICDCss»UTPAOKSPT^QDPSFVR£LCS^mp^^s 

IARVOQCGWNHTIVKVSLIILAU.TILCGCIXVGLr-PAVPMFICnCLIAICAVIFALALI 

UTLYOSCXILPEELPPVPEPQQIQIEDUWmEVLEGTU^VtJLKDR^^ 

CEKIlLGMlJ5RKLRREEEILYRS7AHLKDEERYEFIiELLa<RSLVAIWaXFN^ 

OGIHrVRSEEGEKEISRWDLrSLOOQTVQDLRSRIDDEOKRCWTALQRINOSOKDIQRA 

KDREASORACECTEMXAERQOLXKOUWOLKSMOEWIOIRGTIHOOEI^^ 

LQEDLRLTG lATOEOSUnrREYKEKYt^QiaXMQKILOEVNAEKSEKACLESLVKDYEKO 

I^KnAmJCKAAAVWEEnjCKOOQEDYEOTOEIRRLSTFILEYQDSIJlE^^ 

OOKYSRWEEKOVKEK ILEESMNHFADLFEKAOKEIWAYKKKLADLBCAAAPTE 

VWrOSASL^OKKIRELVEEiroEmCAlAFXSNELTOLVAnAVEAEKEISKLREHIEBOK 

BGLRALDKMHAOArKDCEAAORKCCDIXStl^PVREnAGHRFELJErafiRLOEENAOU^ 

EVERLEOEOFQG 

CPn_0586 675993 677183 

atoC/ntrC-2-Component Regulator

KEKMNPSRGEIJMAIKNILVVD0EPU.R0FLSEU.TSCX:FIPDTA£2iLRNAL0MIRSRDYD 

LV1SC3MSMPDGSGLDLI KI I KOSSPHTPVL\ArrAYGS lENAVEAHHOGAFNYLTKPFSSE 

ALFAriSKAEEIJCrn.VHDII^U<S(7rTPDSHPLIAESKA«KDI^IAKJCAAS 

GESGCCKEVLSFFIHHNSPRAWPYIKWKyUVIPETUXSELFCHEKGAFTGATrK^ 

FELAHKCTLUJEITEVPVm^JAKLLRArOEKEIEHLOGTKTI^VDVRItJVTSKRKLK^ 

IDDKSFRODLYYRLNVIPLHLPPLRDRODDILPLANYFLNKFCRMNm'PUCrLSPKAOEL 

LLNYPWPGNIRELSNVLERWILEKrSLLTEDMLALA 

CPn_0587 677378 678124

•yvyO.Bs conserved hypothetical protein 

SYCELFIt^TLLKHHVTU:DKKRPHRKHVSSKSUUJCOSASTHVEITrKAFRLSMPLKOL 
I LEKSDHLPPMET r RWLTSHKOKLCTEVHWASKGKEILQTKVKNANPYTAVINAFKK I 
RTMANKHSNKRKDRTKHDLCUAKEERIAIOEEOEDRLSNEWLPVECLDAWDSLKTLCYV 

PASAKKKISKKKM5IRMLS0DEAIR0LESAAENFLIFLNE0EHKI0CIYKKHDGNYVLIE 
PSUCFCFCI 

CPn_0588 678033 678526

CT'A6^ hypothetical protein 

TSKS IK3NAF IKWrn-ATMSLLNLPSSODSASEDSTSOSO rFDPIRNRELVSTPEEKVROR 
[^FLMHKLNYPKKLt 1 1 EKEUCTLFPLLMP.KCTLI PKRRPD I L riTPFTYTDAOGrn'HN 
LCDPKPLU. I ECKALAVNONALKOLLCYNYC ICATC X AMACKHSOVSALFNPKTTTrLDFY 

rr;LPE"rsoLLNYF rcLNU 

CPii.OSH't i;7«834 •i7'>i05 

tT^'K) hvi.oth»;r. protein 

IMy t'.*'/Tt:vVI.R:;RPU:KNJlTLTPLFTPB:;LrTFFAKO(XyrUXDYRLTLyr I JLCKYT 
UIHNrr:R!TKt;nK:0[LNAFEArKLn^/ALLEA:'A:KM[QALtJV:X>fKt:Kr::nKLF:;LFLNF 

L'A tr; tRKLEEtt I LOA r tMAKOF.':ELLA[AEFP fA t AEKI FYLFD; '.LOEEKK.^ERNnnEDP 
VIlRIIJ>i.:KWIII'Y 

I 1 11.0'; If I t.Hlill,; *i7'iOl*; 

I'M / I riyi«init;r itMl prnr**iri 



LFL'^DMNU^FACRYLFFF I , . wO^r .^NCLL.-n,rs >-'«.:c;i:r,:ri-niRrrF"K^PDKE 
NME lOAORKKRVEFTILTCEFPK LETU /YC^' JF^MLRAKC Ri IVY r,XYAUJF.-CS3CKMCM 

DFRCKWNRG3T tTryCKCTWUCLgKCVCa^ t vmfKT.-^CNtvoig L 'as tf J^/WSftC I 
Yi*NDLvcFSEVTt:FNVT;3Ererr:TFS*' • •■' — ^ 

CPn_0591 SfI0364 ti.-*l021

yagE family

r,:. rMRCTAYCT/\oA-nXLHVLFHLUCPRYPT t LoREY\':-ANL05TCASN0LA I FFPFCVAV 

\; " *: .V.'. "/Kr.TrKKTT : v-*- ■ .: ; • •■ *■ ■-: : . y ■-. .• - - . ....r;,.. -v \jvn" 

HSDlLDEP0FFVfDHpnX3A:YRDV*»iCLC;iARXN\-;,IV * 

CPn_0592 681132 681461

yidD family

tYSKMFSMSFTCRFLCXJ r PVRICU.: : YLVCWt, I SPLUSSCCRFFPSCSHYAEOAXJCSHCr 
LMOCWLSIKRlCKCGPWHPCCICKVPJCrALOEVLEPYCEIDCGDSSHFSE 

CPn_0593 682494 681391

CT474 hypothetical protein 
VLGAKCMAFKRKmrt>CVLILSVCLNMLrU^ 

VYLSEDFLNEISQASLODLISLFKDERYMYCRPrKLWAI^AIASHHIDITPVLSXPLTY 

TEUCCSSVRWU.PNIOUCOFPVILDYUU:HKYPYTSKCLFLLIEKMV0ECMVDEDtt 

CSTPETLYLRTLLVCAOVOASSVASLARMVIRCGSERFFHFCNEESRTSMISATOROKVL 

KSYLDCEESUWOXLLVHDSDSArtifErCOEDLEKVrRLMPOESPYSONFFSRLOHSWW^ 

LACHST0RVEAPRVOED00EEYWO0COSl>rL lAKRFC X PMDKI ICKNGLtWHRLrraCV 

UCLPAXOS 

CPn_0594 682517 684958

pheT-Phenylalanyl tRNA Synthetase Beta

NTCHYTOVrVKSLVKTSUU^SMRIPlTUOTfTSEPLSTKErLE^ 

LYSFASVITAKIUfT I PHPNADKLRVATLTOGEKEHQVVCGAPNCEACLIVALALJCWCL 

FDSEtjQAYTIKKSKLRSVESOGMCCGADELCLDELQ lOERAXXELP EATPUIEDrjlTVLC 

NTSIXTSLTPNIXa^CASFUaAREICHVTOANLVrPKFySFg^ 

FSYVVITGISA0PSP IXIX}E SLQAlJCQKPINAIVDrTTryiMI^LGQPLHAYDA5HVAtJ>S 

IJlVIXLOT'ESLTLLNCCTVl^SCVPVVRDDHSLt^UX^fMGAW 

AYFLPEALRASOKIJLPIPSESAYRFTRGIOPONVVPALOAAlKYrLEIFPEATISPIYSS 

GEICRELKEVAIJlPiaiORILGKSFSIErLSOKWSLGFSTTPOErSLLVKVPSYRHDIN 

EEIDLVEEICRTESWMIETOfWSCYTPlYKIJCRCTACFU^NACLOEn^PDIiDPErVA 

tTRiaK£EISUX;S!OfrrVIJ«SLIJX;iXXSAAT^^ 

ETQTLAIIXTnXJESRSWLPKPSl^riTSUCGWVERLLYHHHLSI DALTLESSALCEFKPY 
OOCVLRIKKOSFATtCOVHPELAKKAQ IKHPVFFAErjJLDLLCKMrjacrTKLYKPYAIYP 
SSFRCLTLTVPEDIPANLLROKLLHECSKWLESVTI ISIYQDKSLETRNWWSLRLVFQO 
YERTLSNODIEECYCRLVALLNELLTOTKCTINS 

CPn_0595 684943 685926

CT476 hypothetical protein 

RDY0FMKOUJX:^A:V^AKSCSAyASPRR0OPSVMKETFIWNYGI IVSOOEKVOT 
TKW^WATLHEVYSCXKXHGEITLTFPHTTALIW^ X VDQGRI.VSRJaTFVNCLPSOEE 
L7NEDCTFVLTRWDNNDS0T1TKPY 

W SSWIU^ iuLTbNblJVMVKYTTrrglfflDPEStTHYQNGQPHCL 

WR^tO^ OIAj I'l' IVy iwj^iCTSEI AYVKCVKEGLELRYNEOE rVAggyghmMnFT^^g ygp y 

AGGXOKHWyRCRSVSKAKFERLNAAG 

CPn_0596 685930 686457

ada-Methyltransferase

FAVMA017rLrPKIi«NSLSOACSBCU*rAKYPPLOVIVHFONNLVWliLSVAPVP^^ 
LGPAAHKAMOEIVLlCSRYArnCEHPPFSSHFAKDLXPSOYLEIljgcVAE I PFCTOTIYAE 
lAKKTOTOPRTVGAACKQNPFU^FFPCHRVVGSKCERNYVLGPVIHElXX^^ 

CPn_0597 688215 686479

oppC-Oligopeptide Permease

MOKHPSFYORFI^YYKNLLASLSWKFFISVALIGIYAPLFASSKPLLVTWKGEIFFPLL 

RYXJTPCyYTKPVDIJFNVlilvrrFPFFILSFiaTRCWIJUlWLLCLCIISOCMIFAWAYSC 

KVODPAlA£NIJCKMRAEKVR£NISKVNSD<VHLtPKryrRTWEMERR 

KYRKKOEASVKKYOVAFEEKROSPMPTLJWLEMKNECICUaiLOORVDKMORPYE^^ 

WNRATDrmiPFLMALTOIEHEIJUJU3YNNWCQPEDLCrAYA^A™t)lgW 

LEZ»AKLRSAISFIODKRI>/rEKESEDtJlXLINPFFSSFHWEDDAGGSRE>WKYVPWWQL 

SRVTRKDU-ftAtVFGIRIALWAGrCITIAXJ^IGXKICLVSCYFOCTVtMII^SRrreiWE 

TMPVI^ X Litt.VXSITOQKSUXrrrVLI/;CFSVm:FSRYVRX EVLKORDRCYVLAAT^^ 

SHYVIMVVOILPNAIVPVrSLVPFAMMAMISCEACLTFLGIjGEESSASWCNLKReCVTCF 

PAE5AVLWPPAI I LTNLLXA X ALXCDGVRDALOPRL0D5 

CPn_0598 689712 688219

oppB-Oligopeptide Permease

EEOCSVLKY rURLVXI PLTLFAIVSINrVILhMAPGDVLEEKSR£)ALGEACKSDKHRSY 
KCPDRYlX}FREHYGLTLPIFFTn'RPKITHKKIC3TAWElJ«WJhrrTPSAKNAAKSLVYWC 
DCAKFVMPAIXFE.^rDASRDOKYRH X AACLF XRCCVrtOGFVCPNLSPBORAONKEIAESN 
AFLVROLNEEDLtrrKVEAIJCCWrODHCCTEVFCYSSKOFWiaFFLrniFARYMSRVU^ 
PCTLRNnAHKTV XSEVtKRI^CSLVLS I LPMIVCFVLCO I PCMIMALKRNRWXDHSLNFI 

FLXLFS I PV^/AVPtV^LDNFV^^^CTI pfttx pmpysclrsppevfnelstlcrifdlvsh 
CFtPFCAVS YGALAAOSRLSRS I FLEVX^QDF ICAAKARCLRWFDX LYKHVCKNAAVS IV 
TSt^3tjCTLUX;AL\.VETLniXtCFaiFrY0AIUIROHfWVLFnVLVCSALSLVCyLLC 
D ICYVLUJPR VOLECRR I 

CPn_0599 691823 '>^9«82 

oppA-Oligopeptide Binding Lipoprotein

KRRESGKMYKRCVLTK I LKC tVACnL I LLYWSSDLLERD [ K5 IKCNVRD rOEOIREISRV 
VKOCX/rrJOA r PAAr\WMU\PKL'/ROEAFAlXFCDPSYPNLLSLDPYK00TLPELLGTNFH 
Pltr.IUlTAIIV(;KrENL:3PFfCFD*AArcr^0U:tP.':LA:;PHVnKYEEFSrDLAVKIEEHLV 
EOG.'X:DKEFHIYLRrNVFWRrtOPKALPKHVyL0EVF0RPHPVTAIIOIKFFYDAVKNPYV 
ATMPAVAI Jlrxnr mV:.V:;VF»DLKLWr'WKAI rr/ TNEBT^KREn KVt ,Y:;AF'iNTU;LQPL 
Pnnrr'jYFANnEK 1 1 EOEN IDTT/PTN:: rWAONPmnWANN-^ I V.Ti tlAYVF-MIMOOeK tVF 
.'.•RNI'br^Dt^LAAl. t PKnFV\TKE.'n'P:-:r.FOnFKTr:K ID t rJYt.PPNORDNFVrTFMJCL^YN 

KuvAK* / y\vR rTV::AUf(AYT\* rT/flArF.'.T.Fr'j: :r"//Rt :ammma I UK rat [ I tX'(:Laxy:YT 
r : a ;I'Ka:;:;::p: ;ynko t ttwnv;: r- kfaam .i .t;Kw .-w t orr / ;r/ ; i nkkv i r> 1 vpkr frlc 

YYVK:rArAMTrAnWA1-At'KKI':iti:::LI/:iJMA0L:i:iA|.r.KKr(f-DAUM**:!/:tPI*ED 
f'RAtJrfII.':Bt;AMKK*;::i\NV\ft;FHItFJvAnK • irt|*U:YKYl)(.KHr'rrr<r.YHRKHKt tMEFJiPYA 

Ki.i -.MMicsLLYKOWKN t KvrTiiKn iL f (icA..-! .r:ryrjVTMVwijo{Kia)u*L:n':; 
FriSUJCEnrnLNRAKOHLLVK I LRDnrmOHLRCUSUOCE r PHSPCL

CPn_0601 *i?1072 •i9273*i

CT49J hypothetical protein

OFPR IHADD I CNSMDE rTPN"^ P LLRODSLWHR-/RVSWRADLSVSSRYEI ASA :A ILSLLV 
AFCA3AAVG I rFT-VJPLAOVF : DGCLALCLl? I ?LV IGLLI IZ I IVLI-VC I YLFPQQRE 

osgfwpIgVoenLealcncts^ 

yiPAFCLACFFLTtJVCLVTRLYLLSCKCDrFEDLASEYLOGAVPPNKRSONIVEEOSHL 

AAAATKI^INUJNOEYSU-SEIFKFLPKHDLIRKFSCFCFWKDYn-niEClXO^ 

KVVOAIPVDLSAKVSLADAYVAISCLYADPRKYPEFDANYWIPSGRYSAEIOEaCTFATW 

RAIEE^0II-^I^fAPGNAWWAOtAYSyHDL0MPM£EI0EYEIVIJCLKP^raV^^ 

YFOOGMNAKCUtiyEEIKXROYKKSOKLIKTYCVEYKY 

CPn_0603 694136 695185

hemZ-Ferrochelatase
V^IKRLIVUiOCLVSLFtJVKKVTVTTPAYlXANFCCPRHAKDLOEFLrSlX^ 
LPRVLHRHUFTFI AKKRVPKVL PQYQSLONWS P I YFCTETLAKTLSEILRAPVI PFHRYL 
PSTHEKTLLALRTUfTRHVIGI PLFPHFTYSVTGSIVRFFKKKVPEIPISWI PQFCSDSK 
FVSLITCH IRDF WKLG ILEKECCFXFSVHCLP\mYISOCDPYSKOCYES FSAnTNFXQ 
SENFLCFOSKFCPGKWt^PSTAOLCONIDTDKPNVIVWFGFISOHLErrLYEIEROYLPL 
LftSRCrRALRI PAI YSSPLWSTLVD IVKENSTWAEELIKSGKKHTG IR 

CPn_0604 695991 695196 

fliY-Glutamine Binding Protein

CiOOlONSEAOUJVKIKrSWCVNFLICUAVGLIFFGCSRVKREVLVGRDATWFPKOF^ 

TSmriArLNDLVSEINYKENt^INIVWDWVHLrE^aJJDKKTOGArTSVLPTLE^ 

FSOPILLTCPVLWAODSPYOSIEDLKGRLIGVYKP-DSSVLVAONIPDAVISLYOHVPIA 

LEALTSNCYDALIAPVIEVTALIETAYKCRLKIISKPl^lADCtJirAILKff^ 

AGLVKTRRSGKYDAIKORYRLP 

CPn_0605 696737 696150 

yhhF-Methylase 

LRKLCSSRGDVRILACKVKGKSUCTFSNPH IRPTSGLVKEAFFS ICREDIEGAAFLDLFA 
GMGAIGFEALSRGAASWFVDrSrKAIOLIHTNSALLGEOLPWlFRODAOSAIORLIKQ 
KRSFDLIYIDPPYEU:NCY^^LL0KWSGNIU^PEGTLFtX^ttSDEEIACEGL7UlRRR 
KLGKTYLAEYIVEKDP 

CPn_0606 697492 696707 

CT488 hypothetical protein

SSYSRROI-RFYrGSWMHIYGLADLHLAIjCWEKT>tEVFCDPWIGYHQKICSEW^ 

E0rVUJ>CDISWAMNLSEAHKDFAFIGOLPGTKYMIRCNHDYWSSASTSKIL0ALPPSLY 

YLWFAIXTPHLAVVrtrmi^roSPTinnaCENFLTPSTQEOSYTEODE^ 

AFAALPKEVTEVIVmWPPISStXrrPGPISEFLEArxaiVSICIJXIHrHKVORPII^^ 

IRGIHYILVAADYVNFVPQEVM 

CPn_0607 698910 697573

glgC-Glucose-1-P Adenyltransferase

rnWIWIOTOFPEASKFESSHFYRDKVCW'IILCGGBGKRLSPLTNCRCKPTVSrOCRYKL 
IDIPISHAISACFSKIFVIGOYLTYTUXJHIJKTYrYHCVMDOIHLIAPEAROGOOIW 
OGTAZSAIRKNliYrarrEI E:YFLII^DOLY^IMDFRSIVOTA IRTWIKl^ 
AYRMC\aDIDSECKIJDFYEKPQEKEVUCRFXJLSSEDRRIHKLTEDSGDFljCSW;iYIJTl 
RDSU-SLU^XEGNDFCKHLIOAOMKROOVQTLLYNGYWADIGTI ESYYEANIALTQKPH 
A£KHCLW:YOD^KWIYSK^mHLPGAI ITDSMISSSt^ECCVINTSHVSRSVUIIRSKIC 
ENSVVDOSI ItOlARYGSPSMPSLGIGKIXIEIRKAr IDE^C IG^K3VKL0NlJCCYIK^ 
PDKKLFVRDNI I IVPOGTHI POMYI F 

CPn_0608 699690 699016

•Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated?
\^FLYFViaWRRLWR>OCIYEDAKIJlC0AVAILY0IGAIKFGKHIL^SGEETPLYVI^ 
ISSPEVLQTVATXrWRLRPSFNSSIXCGVPrrALTLATSISLKYNIPMVLRRKELQNVDP 
SDAIKVEGLFTPGQTCLVINDKVSSGKS I lETAVALEENCLWREALVFLDRRKEACOPL 
GP0GIKVSSVFTVPTLIKALIAYCia.SSCOLTLANKtSEILEIES 

CPn_0609 699672 699986

CT490 hypothetical protein

0^mtNSLr RENMLIRLF W I SLPKGFPLYLEP PLVXJVTFOCTOFVCTYSEA-mPLYIDNL 
NLNYHYTOELLYKAVPCNYKS lYREI PLI I FPEVLIG3TPT0STE 

CPn_0610 701450 700029 

rho-Transcription Termination Factor

RIFUtFKGSIMKEERSSEILPRVKETKKHAYVSMOEKSCVGECAWASESEEAESVTVTK 
lAKLORMG I EELNI LARCYGVKNIGSLTKSOWFE r ^/KAKSERPDELLICECVLEVLPDG 
rCFLRSPTYNYLPSAEDIYVSPAOIRRFDUOCGOT I rCT IRSPKEKEXYFALLKVDKINC 
5TPDKAKERVLFENLTPLYPN0R IVMENGKDHLAERVLDLTAP rCKCORGLI VAPPRSGK 
TVI LOS I AHA! AVhfNPD rVL IVLL r DERPEEVTDMI ROVRGEWASTFDEOPERHIOVAE 
MVI EKARRLVEHGNDWt LLCS ITRLARAYNTVOPHSCK I LTCGVDASALHK PKRFFGAA 
RN I ECGCSLTILATAL I DTCSRHDEVI FEEFKCTCNMELVLDRRLSDRRTYPAI DL I KSG 
TRKEEtXYHPSELERVYLFROArADLTTIDAMHLLLCRLKKTNSNAEFLLSLKE 

CPn_0611 702133 701420

yacE-predicted phosphatase/kinase

RR^^RRDAKTSEREa3ISYDFIRSYSCE\*L^JWKKLCPi1LKU.KV3ITCDL.SSGKTEACOVF 
s/ELGAYWr:ADE ISHSFLI PHTR ICRRVIDLLCSDW/DGAFDAOAIAAKVFYNSVLLOC 
LEAILHPEVCRIIEEOYHOSIODCNYPLFVAEVPLLYEIHYAKWFOSVILVMANEDIRRE 
RFMKKTCR3r;EOFD0RCCRFLN\^EKlA?ACJVV\'EhJrXrrKKELJ<0K r EEY^ 

OrnJ»».U VO-lriRH Vl^J0J2 

polA-DNA Polymerase I

Kf ; n fT-'JUJU-P/EfiPRR EV/\MKKLFVLDA:k:F t FRAVFALPEMKNJ (QOOATOAVFCF r R2L 
NKt.t KEF^PCrM i:;VFDi;r-NNKO::RyA (YADVKJNP/jKKFEDl tALVKEYCSLICUX 
Yr.FJrE::VEAOf>V r AS l AKKAHEENYKVriWTAPKDULyLVNDMWAWNrWAiy/^WG VIZ 

V I EHv; I pr* :m [ poyL/\lvi;d:':;ton i fvjlfxicopkkaaallkofcsvegllenldavkcl 

.X'mL::ER0ETLKU;KUI-\UJ3i:NI PI PV*prE5LTFP0HPVDEDCL rHFYIO<y^FKTLVP 
:;K(/rEAA'TV[)Vrj f I KPACirLTN tt/JLVCVTi'.Dt AFAyA*rrr:MIILr-':{.KLECLALT0C2nVF 
F I M XEBTTK I LI • t r .KOKt* t .HEDt.TFYOYNI RDCI lALLNA* : ( V t RE [ :> YDLALAEHLTN 

' y y :k t:;iVM.LVNiK:rrrTAMnFAKFwi'.N:v:Lrici'.LPEyrroYrnErvAYLP i ikdail 
Kt:tr it'KNi>iiKt^:u t ^>^rt.KKV^F::MEIUl:^»TLDVF.Fi-^r LI j\urFrrELAVLTEE iydl:; 



•^RPFNtKJPKCLSDtLYTlEL ^ lOKAK JTPAr/LE.\LKJf:tir: C SKLL^^^^ 

tyvkalpkovdshtor I Hp:;FDCTCAi.-n.'.RUv:ROFNu;N' : r . p-er., » w^ kafRwE K 

N5YFL3AOY50 1 TOfrCAKICODKajqFWfrEJGECX! WI^ 
fCTVNFCrVY^ArctyUCVlIKPSrCEAL^*::AYF3RYP^^ 

MUIRER I I0SWNEFPC3RAAGGRFAVfn-R I0G3AAEL I KLAMLDISOA I KOOQMKSRKU. 
Q IHOELLf E^/PEEEIEEMCPLVREKMESAMTLr/P t WN: L :GWWAEC 

CPn.OtJl* 705no2 "iVWS:* 

\'VMKT:.w:^i " ta:" • r.v.v- * :\ .\ : 

KTAPtlAVr EMKDVIAJ3KfJTAKT IL'N; LEJ.- EKA^LKDRVKG IVZDMDCPCGEVTEIOR 
rYSMLRFWKERKGFP I Y I •r/NGU:ASCC\V/3CAATKI YATSSSLICS ICVRSGPFfWK 
BCLhmYGVESDLlTAGKDKAPMNPrrPWTSHDREEROATLDFLYCQFVOrVTONRPLLTX 
EKLVHTtXyU^IFSPEKAK0BC-/:OV%'GATKE0VlCDrVAVCKtErWYRVXGS0CTO^ 
VASAAASSPLVTGMI KHOr LPLSHDAAY I PPYLM. 

CPn_0614 707435 705793 

adt-ADP/ATP Translocase

VFIWOCVGKEFMOSSD^PFSRUWYUrPIYKSEFSKTVTLFLIJ^rrVGFNYa^^ 

TLVIVCSOACAEVIPrLKVVCrVPCAVIVrMVYCV^^SRYPRDrr^TYC 

AVI lYPVCDSLHLNSWDKLOELLPQGLRCFIVKVRYWSYS lYYVHSELWSSWLSMLP* 

CCJWtTrZTEACRFYALINTCLNLSSICACEISYWMGKaTrV'AYSFACDSWHSVKLNLT 

MLITCSCLIMIWLYRRIHHLTIOTSIPPSRRVIAEEGAATAKUCEKKKPKAKARNLFLKL 

iosryux:laiivlsynlvihlfevvwkdqvsciyssh\^fngymsr:ttligvvsvi-aa 

\n-LTC(XIRKWC>rr^«ALVTPLVMLVSCLLrrGTIFAAKRDISIPGCV^^ 
CaW/l^RGTKrrFFOQTKEMAF I PLSPEDK^WGKAAr DCWSRICKSOCSLIYO^ 
I FSSVAASLNVI ALVIXI IMWWIAWAY ICKEVYSRAADAVATUCOPKEPSSSIVREXO 
ESVEOEEMAVL 

CPn_0615 708149 707634

pgsA-Glycerol-3-P Phosphatidyltransferase
LAKIMROFCm^£^RI>riJ^YPCQ£KLHIRLLAIVGAMLSOVLIX;^^ 
ILDPITDKVFVrVCITVLYMBCSLSrAHLTFICARDLFLIIFVCYl^LVKC^ 
FWKI FTWQFI ILLGVTAOGEI PWrCLVPLVALCFLYFLERIMDYKKOFLR 

CPn_0616 708704 710137

dnaB-Replicative DNA Helicase

TLTWESSLLMDKSTCT^LPSPPHSKESEMIVLCCMLTGVKYLNIJ^ 

KIIFRVLODAFKODKPIDVmJkCEELKRHNOITVICGPSYLITIAEFAGTAAYLEEYVDI 

IRSKSIU«MISTAKEIEiaUa.EQPKNVAEAI^EAONSFFKIS0STSVSQraW)ICU^ 

LTTTTDKPYLVOLOEROELFLONAOCa«SFFTCIPTHFIDLOOLIHCrSPSNIiai^ 

PAMCKTAIJaNIAQJLCrONRLPIGirSLEinVDOLIHRMrCSRSEVDSKKISICDLSCH 

DFORIVS^aNEMOE^frU*IDa}PClXVSOLRARARRMKESYDZOFLIIDYLOLLSCSCTL 

RATESRQTEISEISRKUtTLARELMIPILCLSQLSRKVEDRANHRPWlSDLRESCSIBOO 

SDLVMFLLRREYYDPNOKPGTAELIIAKNRHCSIGSVPLVFEKEWRFRNYSAFBCIS 

CPn_0617 710481 712315

•gidA-FAD-dependent oxidoreductase
LMVmiPIAYDVtVVGACKACCEAAYCSAXMGVSVa^TSNLDriAXLSCN^ 
rVREIDAtXX3IMAEVTO0SCI0FRILN0TKGPAVRAPRAaVDK0LYHIHKKRU^^ 
HIMOATVESUJJKEXnaSGVTTKEXamFSCICTVVLSSCTFMRCLIHIGDRNrSC^^ 
SSQCLSEDUCKRCFPISRIJCrCTPPRUASSlNFSCMEEOPGDLQ^Cnwr^ 
OLSCriTHIKElCTmiSANUlRSALYGCCIEGVGPRYCPSIEDKIVKFSDKERHH^ 
PESLKT0CIVANSL5T5MPFDVQYDMIR5VLGLEI1AI ITRPAYAIEYXJYXHCNVIHPrLE 
SKLIECtmXWINCTTCYEEAAAOGLIAGINAVNKVFrmPPFIPSROESYIGWLTO^ 
TQtIJ3EPYRMFTGRA£HRIXLRQDtWCARLSKYGYEU:U^E£RYELVKKQN0U£E^ 
RMKTTROYGOSWSLAKALSRPEVSYmLREAFPNDIRDLGAVLNASLEJffilKYOT 
RQKILXQSLEKAESLLZPEOLOYKQITALSLEAOEKLAKFTPRTIjCSASRISGIASADZQ 
VLKIALKXHAHK 

CPn_0618 712300 713010 

lplA-Lipoate-Protein Ligase A

KNHPTTICIFUJLRCHSILKQI^IEEALLRVANQNFCIINSGAKDSIVLCZSRNLNOCVH 
I SRAOAWrPI IRRYSOGGTV^IDSNTL^^;SWIMNS5EASA0P0ELUMTYCIYSPLLPN 
TFSIRENDYVLGHKKlOGNAOYIORHRWVHHTTFlWDIDLDKLSYYLPtPQOOfTYRW 
SHEEF^TTLRPWFPSRDOFLERIKASGSLLFTWEEFLDNELEEILfVOPHRKATTVLN 

CPn_0619 713462 713013

ndk-Nucleoside-2-P Kinase

RRYVYTMEOTLS 1 1 KPDSVSKAH IGEI LS IFEOSCLRI AAMKMMHLSQTEAECFYFVHRE 
RPFFQELVDFKVSCPVAA^VLECANAVSRNRELMCATNPAEAASGTtRAKFGESICVNAV 
KGSCrrLENAAVEIAYFFSKIEWNASKPLV ^ 

CPn_0620 714145 713519 

ruvA-Holliday Junction Helicase 

DKMYDY I RCTLTYVHTGAIVIECOGIC'/H I AITERWAIEC r RALHODFLVFTHVIFRETE 
HLLYGFHSREERECFRILISFSGrGPKLALAILNALPLKVLCSWRSEDIRALASVSCIC 
KKTAEKLKVELKOKLPOLLPLDSRVETSCTHTTSSCLEECIOALAALCYSKIAAERMIAE 
A I KDLPE3GS3LTO I LP lALKKNFSCVNKD 

CPn_0621 714707 714144

ruvC-Crossover Junction Endonuclease

L3RLCSSFKDNKPKVF0ESIVSELI IGVOPCTIVAGYAI lAVEORYOLRPYSYCAIRLSS 
CKPLPMRYKTLFEOLSCV-LDDrrOPNAMVLET^OFVNKNPOSTMKLAMARC I VLtAWROI 
L I FE'MPWAKKAWCKCIIAiJKROVQVMVaK I LNVPEVLMPGNEOtADAFALAICHTHVA 
RSPLCCyR 

CPn_0622 7l«;7f,l 71470)

i.TMlJ nvix*tnnr. ic.il protein 

f< Y;.**yRLL:; I LKLHLF::LH::rfSr:Lr;t'Hr/Hr:{:::R::MUILLCftWKDAOIMEWrX> tCNILSCV 
':::PMr? :r.WJMOKetOD:^. 'tfOEIIEn l IlUjYRE0L::ALEEE'/RRREEAICtJ0DLEKL00Orr 
Wl/Zlftl y*EKl-00 ( RI lOi^n [ t PE £KKELLO:^VR'I*E I ::Br:RFLCYni (K I KOLEEOtORYVS 

OH' lAi*:; 1 1: £ eedk J: ;.\avak imp lkk:;l t ouvekd t y i rrn ii\F. t aklhekujrobcao 
I': :; :t?/r::: I EK LTi vyroi ai:kkka t aluc^h i wwft (^M'DI ,t ik EKtxAMPcrJTKrjwuc 
' lUj :KK(+:::FVr>wt-::t:::K::i* ;:: 

'.T'ltJ)*.:;! 'I'd I I 7|..|..i 

'T'.'M tiyfkirh'-l i.Ml i'i..'..t(i 

I K'lVfr/ wrmui'V f FTtv i r: ;i'i7 r/Ku^viiNTKnr^onfKTWt^N i w r ::u»i r< >trnc:d 
ii:;KiKLvr'Ai:r>YF7yMiVRi'iT iriLKAv:!.!*!;. :vK iArx:i*EALiKLTK.'rrPLr'VroEKrLA 
1 1: :i 'Kh? rntrr: :r-: :kkkk k k\kk t :i-r« ikkwkkkkk i ::i*M*t hike r AFViv;A:;fjK t LtrrvK 
F.ELWEE::':.':f/E:yEOKKr.-:u-pppAKLtnEvtr:c. jpwtsadlneslcalvressdl

rNALL:;AUUAIHFPETEEEPTaAjrEE:;3AHFFPErrSSATE£E 

CPn_0624 718018 717011

gapA-Glyceraldehyde-3-P Dehydrogenase

AMKWr NCFCR r"RLVLR0IUCRN3SVEVTJ^ r^raLVPGDALTYLFK^OSTHGRFPEDVRC 
E.\DHL r VOKPK IOFL3ERNVONLPWKDUrA3LVI ECTCLFTKKEDAEKH IQACAKRVL 15 

Ar^.Kcn X rTr.wyMHKTFfiPFKDr/t.TNAsrTnjrrLAP TAtcvu-oriFn iteclmttvha 

• ',*,■' • •■ -*Ki '^r,'C/:i" ; ; .\;r:> ;;u\*'A'.':;-:Lt :-:Lr ' :KLT';MArRvr- ' 

xalndrffklvawydnetcyatrivdlleyveknsk 

CPn_0625 718488 718060 

rl17-L17 Ribosomal Protein

VKOHARKKFRVtJRTSSHNRCMlJtfMLKSLIHYERIErrLPKAKEIJtRHADK^ 
U<ARR r A ICRI>(VHYNKLTSKEAROAKCCOTSVyNVDRLVVNKIJT)ELC^^^F^^^ 
RILKLONRIGDNAOKClIEfXAS 

CPn_0626 719670 718495

rpoA-RNA Polymerase Alpha 

WLPAIOOtAQSVVUIKEKCWSDNAHNLLYDKFELPEAVKHLPV^ 
CHGHTLGNAIJUlWLilCIXAPAIISFAKrcVLHEVMAIECVIE 

PMODSSLGRTTOVLKASIS ^DASDtA^ANGQKEVTLQDLLQECDFE AVNPP QVIFTVTQP 
lOLEWLR I AFGRGYTPSER m^KCVYErVUJAAf'SPVTLVNYrVEDTTlWMOT 
LVLIVETORVTPKEALAFSTOILTKHFSIFENMDEKKIVFEEAISIEKENKODILHKLI . 
LGINEIEl^STTCLSNANIETrGEL^aMPEPRLLO^RNFGKKStX:EIKNKIiCEMKL^ 
GHOLTUFGVCUlNVKEKKXWyAEKIRAXNTKC 

CPn_0627 720059 719640 

rs11-S11 Ribosomal Protein

FLrRSRVt.VKN0AOAKKSVKRKOIJCNIPSCVVHVKATrW/riVSITDPAGW 

VGYSGSRKSSAFAAWAAODAAJCTAMNSGIiCEVEVCUOGTGACRESAV^ 

IRDETPVPHNCCRPRKRRRV 

CPn_0628 720461 720063 

rs13-S13 Ribosomal Protein

DAYTILREAORMPR I IGIDI PAKm^KISLTYIYCIGSARSDEr XKKLKLDPEAfiASELT 
EEEVGRUISlXOSEYTVECDLRiWVOSDIKRLIAIHSYRCORHRLSLPVRGORTlCrNSRT 
RKGKRKTVAGKKK 

CPn_0629 721881 720487

secY-Translocase 

KI RLHlPYMTTUlOFFLrrELJlOKU^rALLTACRVGVri PVPGI^CEIAVAYFKQL^ 

SGo^JLFOua)I^SGGAFAO^^rvIALCwvPYISASI ivolflvfmpalormessdockr 

RIGIU^TRIJTVAIAVIQSU^AKFALRMNLTIPGIVLPTIXSSKIJ'GVPW 

TTGTlXUWIGEOZSDKCIGhCISLIZALGZI^SFPSVI/SSrVNia^AjCSQDSSD^ 

rLIIJa.VFV!^^ITTILIIECVRKlPVOyARimGRREVPGGGSna.PUCV^ 

ASSLLKFPATICOFIASESSVIHKRI AALLAPGSLWS ICYVLLI irFTYFWTATOnf PEQ 

rASEWCKNNATIPGIROGKPTQHYIXYTMhWVTLLCAIJTAAIA 

SYFlJQCTAMLIVVGVVUm«afVDArLIi<RRYDSVLKTDRT^^ 

CPn_0630 722316 721885

rl15-L15 Ribosomal Protein

MIKIXSUT)ISERKRRiaa^RCPSSGKGKTSCRGHKCrx:SRSCyKRRF<OT30CVPLYR 

RVPTRGFSHKRFOKCVEEITTCRIAELFOECEAITUmKAKKAIAROAVRW 

EKTrVWODTAWLSQGVONLLGIT 

CPn_0631 722812 722312

rs5-S5 Ribosomal Protein 

EEMSLSKNSHKED0l£EKVI,V\mBC5KVVKGCRKFSFSALILVCn:KCRIXrrGFAKAN^ 
TDAIRKOGEAAKKNtJ«IEAI£IX:SIPHEVLVHHDCAOLL^PAKPGrGIVAGSRIRLIL 
EMACIKDIVAKSFGSNNPMNOVKAATKALTGLSPRKDLUUICAAIND 

CPn_0632 723354 722827 

rl18-L18 Ribosomal Protein

KGLISSVrt-V^JLW\^AP^W^iNLIKVREFVMKMNMSVVKLV^^ 

KSLMKRRRAlJl\mKVLKGSPTKPRI.SVVKT^nCHIYVOLIDDSrCKTLASVSTLSKU^KSQ 

CLTKKNQEVAKVLCTQIAELGKNLOLDRVVFDRGPFICyHGrVSMVADGARECGLOF 

CPn_0633 723760 723209 

rl6-L6 Ribosomal Protein

SMSRKAREPI LLPOGVEVS ICDDKI V/KG PKCSLTQKSVKEVEITLKDNS I FVHAAPHW 
ORPSCMQGLyWALI SNMNWVHIjGFEKRLEMICVGFRASVOGAFLDLS IGVSHPTKI P I P 
STLOVSVEKNTLISVKGLDKOLVCEFAAS IRAKRPPEPYKGKCIRYENEYVRRKACKAAK 
TCKK 

CPn_0634 724215 723787

rs8-S8 Ribosomal Protein

E3SIKRKRI YMGKTSOSIADtXTR TRNALMAEHLYVDVEHSKMREArVK ILKHKGFVAHY 
LVKEENRKRAMRVFLOYSDDRKPVtHQLKRVSKPSRRVr/SAAK I PYVFGNMG r SVLSTS 
QGVMECSLARSKNtGGELLCLVW 

CPn_0635 724763 724206 

rl5-L5 Ribosomal Protein 

GERKANMSRLKKFYTEEIRKSLFEKFCYANKMO r PVLKK IVLSMCLAEAAKDKNLFQAHL 
EELTMI3C0KPLVTKARNS I ACFKLRECOGIGAKVTLRC I RMYDFMORFCN IVSPR IRDF 
RGFGNKGCXJRGCYSVGLODW I FPEINLDRWRTCX;LNI TWVTTAOTDDECTTUXLMCL 
RFKKAO 

CPn_0636 725100 724750

rl24-L24 Ribosomal Protein

FKt:KEVMKKLtN [ RVCOKVF I I^NOKC:KECKVU:LTEDK\AA/Ern/NVR I KN CKRSOONPK 

• iKn c; I Fjvr r 1 1 1 utm lt t at :epakljvkvteocr elworr pl/ttijolyrlvrukkc 
'.Hi jm: 17 /::'>47 i /iiso'*-* 

rl14-L14 Ribosomal Protein

I K m .KVAr-ern:AKKVK( TKVuy;r:fiRRyAr/nr)v r v(^'^'R^VErtl:;:: ikkgdv 
1 KAV I vr<n<Kii iTi'Kn(T:rrt ,KFnrrN:;(n/r i ddk< Mpy/TTu. 1 1*' .>vAnF. MiDR'iF iK iz-sl 

AI-I-VI 



rs17-S17 Ribosomal Protein

NKKEKVKSMA5EPPa^RKVKrCV\VSAKMEKTVVVRVER:F:;HP0YLt0;-/R33KKYYAHT 
ELICV5ECDK\^IC!5r5PL3St:KRWRyiEf(yCXVSl": •• j- 

CPn_0639 725970 725743

rl29-L29 Ribosomal Protein

A3GKGrNMAAKKCLlTOUlCK3DDDLa\YVHENKKALFALRAE^LL0NKV'yKV^ 
KN t ARALP/KOEPK'^KVH.': 



rl16-L16 Ribosomal Protein

tlHLMPKRTKFRKCXJKGOFACLSKCATFVDrCEYAMOTLERGWVTSRQIEACRVAINRYL 
KRRCKVWIR r FP0K3-.TKK PAETRMCKCKCAPDHWVAWRPGRILFEVANVSICEDAOOAL 
RRAAAKLCIKTRP/KRVERV 

CPn_0641 727092 726409 

rs3-S3 Ribosomal Protein 

KGRRIMCOKGCP IGFRTOmWWRSLWYCNKOEFGKFLI EDVRIROFIJUCKPSCQGAAGF 
VVRRKSGKIE^^' lOTARPCLViaaCAOTLXJCEEIJULTCKEVWl^ 
VA£WIAJ«)IERRVSFRRAMiaCAM0SVKDACAVCnnciQVSGRlJ«3A£IARSEMYK^^ 
HTUWDI0YATACAETTVCI IGlKVWINUIDiSSSTTPNNPAAPSAAA 

CPn_0642 727440 727096 

rl22-L22 Ribosomal Protein

RRHSKFKATARYIRV0PRKARLAAGt>lR^ff^EAEE0LGFSOLKACRCUCXVUISAVAN 
AEU^a^IKRE2^^.SVTE^^VDAGPVYKRSKSKSR0GRSPIUCRTSHL^VIVCEICER 

CPn_0643 727725 727450

rs19-S19 Ribosomal Protein

EIRIMGRSUUCGPFVDHHLIJCKNmAMNrEEKKTPrKTWSRRSMITPEMICHTFEV^^ 
FLTVFVSETMVCHKUCEFSPTRIFKSHPVKKC 

CPn_0644 728594 727722

rl2-L2 Ribosomal Protein

FIRCrKSKFKXrKPVTPGTROLVLPAFDELTTRGELRGTKSKRSLRPNKKLSFFIOCSSCG 

YILAPKCI0RCDVVV5GCCSPFKFGCamJCSZPLGt.5VKNIDtRPSSGC)CLVR5^ 

QVZAXSPGYVTUOfPSGEFRKUJGCCRATZCEVSNAOKNLIt\^KAGRRJ^^ 

TAhOfPVDHPHGGGECRKNGYXPRTPWGKVTKSLKTRSKN^^ 

CPn_0645 728933 728598

rl23-L23 Ribosomal Protein

0MKDPyOVZKRHYVTEKAKKLEHLSAGTCECKiaCSFa®PKFVFrV^H^ 
EAZYVDKNVKV^SVOTXNVKPOPARMFRCRRKCICTSCFKXAZVrrifQCKSVG 

CPn_0646 729636 728950

rl4-L4 Ribosomal Protein

YR£DLKVXX5KFDF5GNKZG£VCVA0SLFADECa;LQLZKCYZVAZRANKRQWSJC^ 
SD^KSTKKPFKOKCTGNAROGCU^PQFRGGGZVFGPKPKFtCHVRZNRKeRKAAZRIX 

LSLRNLTAVKCFVyGXNXNSYDZASAHMZVZSXKALQELVERLVSETKD 

CPn_0647 730490 729657

rl3-L3 Ribosomal Protein

YLEYrSYCKmJ»PLZTCPFZFUlQ«arFLENSZSKZLSRFVSLFLOEESKSLlIilD^ 

RSHXSVHCKKBGHZHZF0KDCSLVACSVZRVEPtAArr0ZICrXESDCYrSL01<»E^^ 

AHTirWWSKPKlJCHLRKAGCRVFRrLXEVRCSEEAl^CVSL^ 

GZSKCKCFOCWKKFCFRGGI^SHGSCrHRHACSZGMRSTPCRCFPGSKRPSHMCSAB^ 

VnCNLEVZKVDtXXKVLLVXGAZPGARGSXVXVKHSSRT 

CPn_0648 731636 730605 

CT529 hypothetical protein

FFFKKPCKEVKMAT»ftIRSACSAASI04LLPVAICEPAAVSSFA0KCZYCZ00rFn«>^^ 

AKFVCATKSIXKCFXLSXAVSOCVVCSLEEACCTCDALTSARNAOGMLiaTREWA 

L^XyWPSZVNSTORCYOYTROA^E^/;SKTKERKTPCCYSKMLLTRGDYIX^ 

CATTYSATFCVUlPUlLINKLTAKPFtJJKXTWaJFCTAVAG I WrZNKMACVA^ 

EOKIJlUlAK£5LYNERCALEIJ0050l^OVZLSA£RAUlK£HVATI.KRWLTL££^ 

WtX;VKLZPLPZTVACSAAZSGALTAASACXCLYSrWOKTKSGK 

CPn_0649 732672 731710 

fmt-Methionyl tRNA Formyltransferase

LNUCVn^fPGTPTFAATVLODU*KHKZOITAV^^^lVDKPOKRSAOLZPSPVKTI^ 
U^PSKASOPOFZ EELRAFNAI^X WAYGAZLROIVLOI PRYGCYNLHACLLPXTRCAA 
PtORCZMECAT£SCrm^IRHI»CMXnt:i»(ANITRVPlGP0KrSCEIJU>ALASQGAD^ 
TL0OZESO0U3LVS0OAALATIAPKLSKEBGOVPWDKPAKEAYAHIRCVTPAPGAWTLFS 
F5 EKAPKRI^Z RKASUJlEACRYCAPGTynAri^ROElAX ACSBCAXCXilEVQVEX;^^ 
SKSFLNGYPAXKUCZVrrLNN 

CPn_0650 733513 732665 

lpxA-Acyl-Carrier UDP-GlcNAc O-Acyltransferase

SRR^IAS IH PTA I r EPCAK IGKDWZ EPYWI KATVTLCDNVWKSYAYXDGfnrZGKGT 

TIWPSAMIG^n<PCCLKYOCElCTYVTICElrcErREFAI ITSSTFBCTTVS IGNNCLIKPWA 

HVAHNCT tCNNVaXSNKAOLACHVOVCDYAI U3GMVCVM0FVR IGAHAMVCALSG IRRDV 

PPYTIGSCNPYOt-AC INKVCLORRQVPFATRLALIKAFKK XYRAOCCrrESLEETLEEYC 

OrPEVKNFtEFCC:JPSKRGIERSZDKQALEEESADKE(r/LZES 

CPn_0651 733975 73351?

fabZ-Myristoyl-Acyl Carrier Dehydratase

MNOPSV t KLRELLTtLPHR YPFLLVDKVLSYD lEARS ITAQKHVT INEPFFWCHFPNAP I 
MP<TVLtLEAUOA.VJVLICLVLEHDRNKR f ALFLCIOKAKFROAVRPCDVLTLOAOFSL t 
f;.':K'r;KAWA0ARVr>S0I.VTEAEL3FALV0KE3I 

CPn_0652 7UHao 733990

lpxC-Myristoyl GlcNAc Deacetylase

kkn:; r t Yt:D:;L:\:n*Mr.FJrroR'PLKREVRY3f .'vo i Ht/JKCSTLMLOPAormr; rvpoRo;: 
a:^ ;fr^ENVPAt.i.rHVYTpf :n:rrrL3R0:;AV t atvehlmaalrsnnionl r tocf5:EEi p £ 
*.;onr/.:tmvKUic\}AC. iceoedio/c rAPLTPP'/YYOHOOiFLAAFprroELK i.t/tuiypo 
: ::rr r'rroYK:;Lv ( NE£:r:KRot: i Af *:RTFALYtiEu:FmEKCLiccx>:LDMr\wFKow ; i r 
r.u* y^i j(FAUt:r'VRHK i ltjt. t* ;nu*:i;-A:Rt>FVAHV[j^VG.'y;HSSNtAFx:KK tij:Au:L 
cutE-Apolipoprotein N-Acetyltransferase

■ ;EPVLJI r rrrv t .-^ffTL r AFAOPOUWrVS I tjCAACCYCFFWYSLEPUCKPSLPLRTLFVS 
CFI^ I FT r E'-: I HrSWLSDQY ICKL I YLVWLTL r T r L5VLFSCFSCU.VAX VROKRTATL 
WSLPCVWVArCMLRFYG:F3C»4SFOYU>rPKrASAYCR0FCCFLCWACOSFAVrAVNM^ 
YCLU-KKPHAKMLVWLTUXPYTFCAIHYraKHAFOODKRALRVAVVOPAHPPIRPKLK 
S P I WWEOLLOLV.': P tOOP IDLL r FPEVWPrrJKHROVYPYESCAHLLSSFAPLPECKAF 
l^NSOCATAI^OMFOC PVnCLERWVKKENVLWYNSAEVr SHKG ISVCYDKRILVPGGE 

■■• r r»r:KFrr:.r'-Pcr.Fnr^Ar/:rKRLP-,RR.'V7/'/?vRf:LPP tctttt^eetfcyrlosyk 
vti .v:i; .rnr' r.i KvnrrjKiMLtM hMP' -/r^-v -r-^T/T.Vw :lk 
. u: vlpti'.cTkai ;vl.lt. ;Li Lrtj'rKTLV(;vj'jLv:\MiL.iAFL%v/Jv:/j ;c;t- Ljv?,LL<\K 

KEIR t 

CPn_0654 737051 736503 

vdlD/yciA-Acyl-CoA Thioesterase

KKrrDFLSVDRyYRNOE^PIKILSVESTMUCKKPVSFSCiroHIYXrFPNDUON^ 
CUJlSLIXRIJU.WAERHTESVCVTAFVOALRFYAPAYMCENLICKAAV>mT^ 
VKVWAENI YKOERRHITSAYTTFVAVMECNOP I PVHOIVPETPEEKRRYNEADRRROARL 
ELK 

CPn_0655 737856 737101

dnaQ-DNA Pol III Epsilon Chain

KEIKSLlJCrmTTCUXrEjnCUJVKKORIIEIAAVRFTFDSVISS 

QRVHH I S^^AMLRtX}PK r AEVFPOIKAFFKEGDYIVGHSVGnJU^T-WEMERIGETF^ 

YTIIm'UU-WCEYGDSP^WS[^IAWF^A^DGNH^W0a3VXrN 

OLKOVUUCPrKMKYMPUJKHKCRCFSEIPLAYLOWASKMDFDSDLLFSIRHEIKHRQKCT 

GFSOVNMPFMEL 

CPn_0656 737842 738048 

No robust homolog present in Genebank/EMBL as of 11/7/98

THNFLLLPLSLFDIIiTVECFLCLTLYFASVORMPCEQiaiVPGNLYYYYIAAHSSLCLSV 

CKirmEMKD 

CPn_0657 738476 738051

yjeE (ATPase or Kinase)

PMGRYRRVSHSSOErUiCTELCQ^n:VPGAVIiLrGDYGACKTEFVRC rVSGYLGDTIAE 
EVASPSFSI LHVYGNEPKRLCHYDLYRIDQKNOEYIFQDAEEDDVLC lEWADRLPKPRFC 
• DTINIYITMQTNMEREIIIEKR 

CPn_0658 739180 738455 

CT538 hypothetical protein

KRVGMDISGAVKOKMrWKOKKPEaj^TYLFYLEOALSLRPVVrVRDKirr^ 
RILEODKKrWRETCIOISSEKPOVNEimairYICPrrcKVFADNVYANPOQArYI^^ 
PQNMEKOOGWI KRFLVSEDPDVIKEYA\^PKEPI IKTWASAITGKIJ'HSLPPLLEDFI 
SSYUlPOTLEEVatmKFQI^SFLSLLQOALVEDKIAAFI£SIJU3UrAFHVYISQW\n7^ 
EE 

CPn_0659 739482 739838 

trxA-Thioredoxin

LODWRDSNSIFREGKLHVKIISSENFDSFIASGLVLVDFFAEWCGPCRMLTPILQJI^ 
EU>HVTIGKrNIDElJSKPAETYEVSSIPTLILFKIX»IEVARVVGLKDKErLTI^Ihna^ 

CPn_0660 740327 739860 

spoU-rRNA Methylase 

MRVVUtCPDI PO^^Xa^IGRTCVALGAELILVRPLGFSLADK^VKRAGMIyWDK^ 

SIEEAIJ{D\^EDOIFCLSTKGSASYTEFSLPSSGTYVrCSESKCU>KEIUaCYYKNCW 

PMQOOIRSLNLATSVCrVLYEWRQKTVALQKNPTV 

CPn_0661 741139 740327 

mip-FKBP-type peptidyl-prolyl cis-trans isomerase

HSRCLKIKDRRRKMNRRWm.VlJ^WAIALSVASCDVRSKDKDKDOCSLV£^^ 

ELSraJQKLSRTFGHIXAROLWCSED«FFDrA£VAKCLOA£l,VCKSAPLTETEYEE^ 

OKLVFEKKSKENI^UVEKFUCENSKNACVVEVOPSKLOYKXIKBGAGKAISGKPSAI^^ 

KGSFINGOVFSSSECNNEP r LLPLOTriPGFALCMQGMKBGETRVLYIH PDLAYGTAGQL 

PPNSLLIFEINLIOASADEVAAVPOBGNQCE 



CPn_0662 742938 741172

aspS-Aspartyl tRNA Synthetase
SKGCYMKYRTHRCNELTSNHIGENVOLAGWVHRYRNHGGVVFrOLRDRFCITOrVCREDE 
OPELHORLDAVRSEWLSVRGKVCPRIAGMENPNLATGHIEVEVASFEVLSKSONLPFSI 
ADDH INVNEEIJILEYRYLDMRRGDI IEKUX:RHO^MLACRNFMDAWFTEIVTPVl^ 
PEGARDYLVPSRIYPGKFYALPOSPOLFKOIXMVOCLDRYFOIATCFRDEDUIADROPEF 
AO IDIEKSFGDTODU.PI lEOLVATLFATOGIEI PLPLAKKTYOEAKDSYCTDKPDLRFD 
UCLKDCRDYAKRSSFS I FUXJLAHCGTIKCFCVPGCATMSRKOUEYTEFVKRYCAMCLV 
WIKNOECKVASNIAKFMDEEVFHELFAYFDAKDQOILLLIAAPESVANOSLDHLRRLIAK 
ERELYS DNOYNFVWITOFPLFSLEDGKIVAEHHPFTAPLEEO I PLLET3PLAVRSSSYDL 
VLNCYE I ASGSOR I HNPOLOSO r FT I LK I SPESIOEKFGFF r KALSFCTPPHLC lALGLD 
RLVMVLTAAESIREVIAFPKTOKASDLMMNAPSEIMSSOUCELSIKVAF 



CPn_0663 744220 742901

hisS-Histidyl tRNA Synthetase

PIFEKSEVrLHVCEESDVA^EVYSFU3RKGRSMTLRPECTAAWRSFLEHGASHRSDNK 
rraLPMFRYEROOACRYROHHOFCVEAIGVRHPLRDAEVLAtXWDFYSRVCLOHMOIOL 
NFLCCSETnFRYDKVLRAYLKESMGELSALSQORFSTNVLRILDSKEPEDOEtlROAPPI 
LDYVSDEDLKYFNErLDALRVT-PIPVArMOor vorrr nwenr t/t?e>»«¥-w^com»<n#k» 



CPn_0664 744775 744557

No robust homolog present in Genebank/EMBL as of 11/7/98

r H> 'lvh YB WV ' ^ ^ KCrrTNKEHDAHATVLKT^RAKYNLFFVODVFnVHEV I EP 

uhpC-Hexosephosphate Transport

KMfA/vrrK n vr*i'Kii i kf. I EoyEWKKKYK'A^m m r kv::mf nrr i iyyftrkcftfamptl 
I Ai .t t :KnKA«..) * : 1 1*:: rrLvr: :y': t.':KFV.':cvMSt>3r;ripRyFMA u \Ui itcltn ifftmss 

: ; 1 71 .KAl WW ;i MKtJi S^Wf 'Pt:AH LLTI IWMKJ Ef-Z.TWW; :VW; .Ti;i tN lOCAL I P I LTCF 
I I f .Y:;* :AM-^Vh ; t U'I» ;MGLVl.INnLROTFOr;t/;LrP f EKYKROPHMAHIIEXJKSAnE 

' ri-KK ( i;i<i :i .: rr« i-: t r .m-vr ;iTK>/r wfiwaaiiff i y f vRMAVNDwr;AtF l i etkhyaavk 



ANFCVSLFE IOCaX>(LVAC/*---*CK [ 3Kt:Nft^PKP;VLFJ:-j:.LrAI LOMWF-RSH^^>WV 
OCrr-LFV ICFFLYCPQW IGLAAAEtJ3HKKAACTAS';rrC>fFAYFaAT 

wcvocGFFrAUJvaksiAiljlF ( *■ " r ' 

CPn_0666 746370 750107

dnaE-DNA Pol III Alpha

CFFL1VI PLHCHSOYSVIXAMSS rKOFVAKOOEFCI PALALTOHGNLYCAV'OFYKECTOK 
TOP T tGCECY T APn.-^ P F0KKKEKRJ;RAAHHL : LLCKNECOYPIJI^ I LTSLAFTBCFYYF 

n- :t'Kr:.LP'.y.;E :: : ^A:.//^r_\;.:.:.::;^wK'■.L■Lrr■_: v-T-r.'iLiiK 

yjiiEJ lACF K££WwKvuV.' : : V. vr.v.:.:<A. K;^L.:: t r/\rNo::tv;:/A:jwcAH 

EILLNVOSGETVRI AKONTHI PNPKRKV^-RSREYYFKSPAOMAELFKDI PEV tSNTLEVA 

KRCOFTFDFSKKHYPrYVPESLKTLNSYTEEDRYOASAVFLKOUVEEALPKKYSSEVlAH 

lAKKFPHRDPIOrVKERMDMEMAI 1 1 PKCMCDYlilVWD t IHWAKAICI PVCPCRCSOW 

SVU-rLLGITEIEPrRFDLFFERFINPERLSYPDIOrDICMACRERVIWYAIERHCKENV 

AO I rTFCTMKAKMAVOCDVCRTLrMALSKVNH I AXHI POLNTTLS 

AESAQVr tJKALCIXCS IR>nT:VHAACVI ICCDOLTrmi PIC ISKDSTMITTOYSI«PVES 

VCMLKVDUjGUCTLTS INI AMSAIEKKTGOSLAMATLPiOOATrrSLLHQCKTMSIFOKE 

skckjelaicnlrpdlj'eeiiahgalyrpgpmdmipsfinrkkgkeiieydhplmesilice 
tycimvvoebvwiagalasysu;bgdvu«whckkdfoqmeoerekf^ 

LAT/irDKMEKFAAYCFNKSHAAAYGt.mTrAYUCANYPKEWlAAIXTCDSDDIEKIC^ 
URfJWSMG IPILPPHINVSSNKFVATDECIRFAMCy^I KGIGRCLIESrVEEREHHGPYE 
SrRDFI0RSDUCKVSKKSIESL2DACCn>:F0SNROLIXASVEPLYEMAK0KKEAASW 
MrFrrLGAMDRXNEVPra.PKDIPTRSKKEIiKKEKELLGIYLTEHPMimW»{LSRW 
VlJ^GEFEM.PHGSWRTlff'riDKVTrKISSKAOKKrAVLRVSICIDSyEU'IWPCHYEEO 
0ELIXEDRLlYArLVII)KRSDSUlISCRWMKDl^IVNOIII.YEC00AFDRIICN(:VQia« 

^MSTSGK^^KAKC^«p^^ocK^QAIJ^pvTLSU^L^EIJu^SHLcrtJ« 

VFTODNERVASMSPDDAYFVCEDIEELROELVTADLPVRVnV 
CPn_0667 751097 750177

No robust homolog present in Genebank/EMBL as of 11/7/98

NISLLCKlOKRYFWaCLILYFAAFVASLFCCVFLWDRVPCAOKIMRlAADHSSEVFSKSC 

RnWCISGFEn,QVrERMVSPEOALALFPEYRDGKSFVaAriPKTLKHVRfSKEEPVKK 

HI ISOECEILWSLVNGEMNOimnwrcSKGFREaJLLHAGKOaiRVIQTLATtOTT^ 

SLAOALAUCNIRAERVIKECOKKIaIFASCN0IGTH^0OFOPIRCCTTrL^I^C«»VWL0X^ 

RKAAVFPAQYSEDRVRHLVlWirCDNrLIVRSSKVYVIVYKISLVSAIWSVRVEYINAVT 

GKSFQDL 

CPn_0668 751176 752162

CT547 hypothetical protein 

WRFVWSPRLIMKTLLYVPLLLVLVSTCCDAKPVSFEPFSCKLSTORFCTOHSAEErrSO 

GQETXJOtOirRKALLCFCIITHHFPRDILRNOAOYLICVCYFTQDHPOLADKArASYLOL 

POAEYSEELFOMKYAIAORFAOGKRKRICRLECFPKLKNADEDALRIYDEILTArPSKDL 

GAOALYSKAAU^IVKNDLTEATmJCKLTLOFPLHII^SEAFVRLSEIYljOOAKKEPHW^ 

QYLHFAKLNEEAMKKOHPNHPLNEVVSANVGAHREHyARGLyATCRFVE^ 

YRTAITNYPOTLLVAKCQKRLDRISXKTS 

CPn_0669 752140 752775

CT548 hypothetical protein

IEYI^rLPKIEI^WRLFSLGTIYL^^SLALSSCCCYSILNSPYHLSSLCKSLLOERIFIA 
PIKEDPHGOLCSALTYELSKRSFAISCRSSCAGYTUCVELL^rcIDKNlC^IYAPNKLCDK 
THRHFIVSNBGRLSLSAKVQLINNITrOEVLIDQCVARESVDFDFEPDLCTAHAHErAXfiQ 
FEUHSEAIKSARRILSIRIAETIAQQVYYDLF 

CPn_0670 752738 753196 

rsbW-sigma regulatory factor-histidine kinase

PRRLLNRYTKTFFBCETVFPAVI^EIJiSMLDLIKRAGKOSKCPOEKLUtt^EUCEELLVM 

IISYAYOGENSPGTIAISCISHRCOLEWIKDHCPSFNPWVSINIOE33LPLEORKLOCL 

GIFLAKSSVDEFLYAREOHCNIVKLKHLNGOHS 

CPn_0671 753660 753205

CT550 hypothetical protein 

RITINORKYTMSLDrTEEFYHOSILrrrCTSFPECYLNIAEILSYPHCTDAIfrDFLCSOSD 
NDFIIAESKDKLTLF^iADFAIWLVPELVOGOAVTRCYIAVS0GBC^rtXPE>^AFEASGOYN 
QSSLILEALOLYLKDIKDTENALRSFRFmW 

CPn_0672 753723 755048 

dacF(pbp5)-D-Ala-D-Ala Carboxypeptidase

T r KSPHMKR PFFTYLCI IFYCSCASLSLHACLSFPEVRG ATAAWKADSCKVFYDICDrOA 
VIYPASKTKIATALFIUCHYPTVLOTLIKVKODAlASITPOAKKOSCYRSPPHVfLCrDCS 
TIOLHLREELLCWDLFHALLVCSANDAANVLAMACCGSVEKFMOKLNrrUtEEICCTHTH 
FN>n»HGLHHPNHYTTTRDLtSIMRCALKEPPFRGVISTTSYKICATNLKC*ERlLSPTNKL 
U.PCSTYHYPPALOCKTGTTiaACKNLIMAAEKNNRLLVTIATCYSGPVSOLY0C7VIALC 
ETVFNEPLLRKELVPPSDCLOLEIANLCKLSCPLPEGLYYDFYASEDREPLSVSFIAHAD 
AFPrEOCDLUJHWVFYDOEGKKISSOPFYAPCRFERTIKPWKLYKKRVTTSYRTYMSITM 
LLMYFR 1 RKHRKYKNLKHY3K I 

CPn_0673 755242 755463 

CT552 hypothetical protein 

GKSTBCKAYHCFLKQVSIALNREE\A«3NPHHLHFILMOF0OFSGE0DRFGSFLEATIROR 
VSFLVLOEKIATLK 



CPn_0674 75&*»g5 755577

fmu-RNA Methyltransferase

RG t LYVTMVPFROHHAYOLLKOLHTSA r SEAORVSYYFKONRSLCSKDROWIONt IFNIL 
RHRRLLETLI LOSCEOVTPEALVAKVNECVLENLOSYSA I PWPVRYS ISODLAHFLVODY 
CEEOAEE r AK EWLTEAPIT I RVrrrOK I3VKEL0EKLEYPSSPCELPEALHFSKRHPL0ST 
EAFRRGFFE lODENSQR I SCO I ZVTDKD t VLDFCACACGK3L I FAOKAKHW INOSRKAI 
LOTAKHRLLRACARNFSLALWLP.UiriF.'TWtVDAFCarrWFRRHPEHKWOrSKKLLLNY 
*/R VrjKS t LKOASAYVCPRGRLW ITCrXLKEENEAHVAYMHSUIWKEVHRKTLPLOVCKC 
DAFFTSMFOKt 

(TT*/)K hypothec teat p cot** in 

7Pl.';m r LDFOFS t WL«vt4:t-\ I u u 7VH \ I ^\vi /i*KHi .f^LUAwi vNi ii 't .rrNYOTin/TTr r 
r<yv mELF:nrf;;A lUY;: l::::Iuay^l r r l^uii- KKfvrr wi.Yp.LFKrr.rvi 1 1 kka ivoklcm 

FKrXrLrEr;KRPVDK[V(.^V\NKVFnK^K:;Nr:::WKI)FTIIL-/TV::rVvrri-V:FWRRU^ 

DA.':u>1£rD\Lr^LLa:^^^AYL^u•:r,Ru/A.>K^f;^:KA0l•^Jrru:^:K::•A/^I.nta.rc^^ 
r;AEDFOTi FMS [ r:":D::ij;tVLAi(:xi';rjyi'i;rKiK :KTP'/':t><>t':rA(^;:i*r:u::Kr au:fl 
PCT/US99/26923



homologous to CT695

-•JMrrrP tSCNDCDRNT r SDPLEE3AAEECD5DLEDRVSESAT0Vr ET I ADTC I PEATPSEX3 
TrCDLN5DLVDRVEYEARC5U.TT>aJVR I RKAVSO rW^«VKTKRHPKEOCW 
LLKATRLPKETAEPPYFTALETAIJ^CRSrrFHVFLRLFTLLiUlOHPEAPLDLCCTrDPIS 
PEAAVAFAL I LRSCCKWATDAVQECLPLEV I EEACMYNATSLEATnVEEVSKRLSELL 

VrrOKR r DC LJVNVRC ITK : itsf^u:acoc/s\M3nlktydlcrnytqvw 

K^.FJJFALVMKD r r.Vr;/R0nRrKnr.DFUfl*WSEEHA3EVNYDVVlJkILr/N^ 

*:" *-\v-.Kr::ir,-v: w'v ' :ki rr 
CPn_0677 760410 759256

No robust homolog present in Genebank/EMBL as of 11/7/98
R I AMC INPSCNRSPDOVWRGAOCDSSSTOGTCATOSNUaWnm'STSOPOVASK^ 
WTTVREFFLGKKSPDSSCXIASGPAMOSPSGPTIRPTRPAPPPPTTC^ANAKRPATKOT 
AP0PPTAGSSSGSE0PTAKSSEVAJaVSaja3AVHSHAES0KVLKKVS0EIOTC*m»re^ 
^mCPOYriHCYRVIARA^OTV^E0SMLlBCTSSTCPVPOA^^^VAKnAVTQT^^^ 
ENPKPGNDPDGVU4QWISLCI BCPTU}PCES lONTt^RVSOrOCDDSDIDyrSDIARL 
■33AUJRVRENHPNE«PRIWIAIAR£LCyUkVHSHATSVRIANAGKNHTRDVAr^^ 
LOCMKVLSVCAWANTmVLIGDLFE 

CPn_0678 761329 760682

No robust homolog present in Genebank/EMBL as of 11/7/98

KII^^SVNPSGNSK^rot.WrTCAHDOHPDVXESGVTSANLGSHRVTASCGROGU-WlIKEAV 

TCFFSRMSFFRSCAPRGSOOPSAPSADTVRSPLPOGDARATECACRNLIKKCYOPGMKVT 

rP0VPCX»3A0RSSGSTTLKPTRPAPPPPKTOnT*AKRPATHCKGPAPQPPlCr^^ 

ATKGKCPAPOPPKGILKQPCOSCTSGKKRVSWSDED 

CPn_0679 762936 761725

pgk-Phosphoglycerate Kinase

HLGRPKGOGFOEEYSLOPVVDVUXniiCHHVPlJVPDCVGEVABOAVW 

RFHIGEEHPEKDPTFAA£I^SYGD^fV^roAFCTSHRKHAS^mArPOAFPGRAAACLLKEK 

ELEFLCRHIXTSPKRPFTAILCGAKISSKIGVIEALUCVDnilJlQGM^ 

U:NSLVEICSALDIJWN\^rAKSWiVTrVLPSDVKAAD(WSKE^ 

FDIGPRTTE£FIRrrN0SA^VFV^GPVCVVTVPP^OSGSIAIANAUamPSAVT^^^ 

AAAWALACCSTKVSHVSTGGGASLErLEOGFLPGTEVLS PSKS 

CPn_0680 764254 762971 

ygo4-Phosphate Permease

YSMLPLIlFVLrX:GFr^SWMIGA^roVANAVGPSVGSGVLTLROAVVIAAIFErFGALLLG 

ORVACTIESSWSVmiWIASGDYMYGMTAAIXATCVWlOLASFFCWPVSTTHSI^^ 

GFGLVLGKCTI lYWNSVGI ILISW LSPFMOGCVAYLIFSFIRIWirTOroPVLAKVR^ 

PFLAAI*VlKTUrrVMISGCVIIJ<VSSTPWAVSGVLVa2J:^irrFT^^ 

PKKGSLTyRUCEROCim;RKn.VVERIFAYL0IIVACFMArAHGSNDVANAIAPVA£^^ 

OAyPASYTSYTLIRLMATGG ICLVIGIAIWGWRVIETVCCKITELTPSRGFSVGHGSALT 

lAIASILGLPISTTHNAA/GAVLGIGIJUlGIRAINUJIIKDIVLSWm 

FALRALFK 

CPn_0681 765001 764258

CT691 hypothetical protein

tCIRSHKSFTRSFRCfVIIAXKAILMQTIJUlLrGOSPFAPlJOAHLEKVVSa^^ 
UUXniYEEIXQlAKLVSDKEYOAIXIKNrMRNHLPAGLFMPISflACILEIISIODSIAOT 
AEOVAIlXTIRRLNFYPSMETLFFRFLEKNLEAFELWriXHEFTJO tI£S^ 
RLLVCRVAKSEHESIW)R£I^IFFSDDFIIPEKErYLWLOVIRRTAGrSDSSEKLAHR 
INKTLEEK 

CPn_0682 764912 765955 

dppD-ABC ATPase Dipeptide Transport

TSKCU^KNSLFRNNM.PICRSaCRU^AS^^>It^IEDLSITLAKOROOyPI^ 
OTIAIIGESCSGKSVSAHAILRLLPCPPFSVSGOVNFQGHNLLTASRSIQKKIIGTEISM 
IFONPOASUJPVFTIEOOFREI IHTHLW.TAEVAKEKMLYALEErGFHDPRLCLNLYPHQ 
LS0GMU5RICI AMALLCSPKLLIADEPTTAIIJVSVOYOIWIXKTLOKKTGMSLLI ITHN 
MCWAETAO0VLn.YAGRMVECAPAV0KFHNPSHPYTRDIXASRPSLQPQ0lJCSFNPlPG 
QPPHVTAFPSGCRYHPRCSKILNBCSAEAPEIYPVREX3HKVRCWLYDD 

CPn_0683 765936 766919 

dppF-ABC ATPase Dipeptide Transport

GVGCKTTNFPOPLIOATSLTKHYYKRSFWrOSKTIASRPVDDVSFSLYSRRAVCLIGESG 
SCKSTLAIALACLLPLTSGFLTFNGTPrKUiSKHCRHOLRSQVRLVFONPOASLNPRiCT I 
IDSLCHSIXYHKLVPKEKVLATVREYLELVCLSEEVFYRYPHOLSGGQQORVSIARALLC 
VPOLI ICDEIVSALDLS IQAQI LNMLAELOKKLSLTYLFISHDLAWRSFCTEVFIMYKC 
OrVEKCNTKRI FSDPQH PYTRMLLNAOLPCTPDOROSKP IFQEYHKDSEESCSTGCYFYN 
RCPQKOEACKSEIIPNOCDAHHTYRCIH 

CPn_0684 768056 767181

spoJ/parB-Chromosome Partitioning Protein

EKSCDIVTEEISKarilEVAIODIRVSPFOPRRVFSNEELOELIASIKAVCLrHPPWRE 
rCTCDRVLYYELIACERRWlAMOLACATTIPVIUCHVrADGTAAEATLIENIORVNLNPI 
EMAEAFKRL IKVFCLTODKVAYKVGKKRSTVANYLRLLAZ^KT IQESLLQCO ITLCHAICV 
[LTLEDP ILREKLNEI I lOEHLAVREAEL I AKOL 1 3 EEC SS I ELKPTPLDMAESSKOHEE 
LQQRLSDLCCYKVOrKTRGSKATVSFHLOmrjOLOICLEAWLSSHGTLSESLS 

CPn_0685 768026 768217

No robust homolog present in Genebank/EMBL as of 11/7/98
FPpSQYLL I FPNR I U3L0AFEI LDVOCKLTDORKH lOMLHKHNS IE I FLS^WVVEVKLFF 
KTU 

CPn_0686 768373 768176

No robust homolog present in Genebank/EMBL as of 11/7/98

AKDSMMPOJRLFRV/OELFFFS^lV'N-VCEORRPRKLYPCLOHLNFPrEKPRFLLKGFKKEL 

HfYHMV 

'TIH^ liyporttur Lc.tl ucotttm 

l(KtHKJItJ^nAYRFrrTPl*:ri:;rHCKLVHNrWKKFY.':F.';CAIAtCIVt.\i:KLr;LKtV.':NTYK 
li:K>AKWi:;iLLLTPV\ALVAVr^>:FLP3KSALGJLE0AYHLrx;E3MKPYAi;FLA:;CFYlMN 
Ki-Lk* lAYYAt UAYtiUnOALOLPKP tOKLLKE tSEAOADOLVOVAL^iKJYOLLOTANntjPE 
VKri^:FLTt,a<V[ELKELUKjOV:.>:DFAALK:;SPLFHOFEnM'r:;CX;EWTLr;KRfr*KKG 

'Iti.'M.HH /t.n7f. /70U7 



.ILMLIVLAFROVFF-HSR.^O. .^NYLRLLKCNFA :rLrKtRT.*K ;H.::.MLrFrFASFS 
FYTNI FPFLEEOK I PAWTTVASRY t FG^AAyOU irSHRLKr::ETL*\F';CE I r JI^fMPFCC 

QNELx EMAKSPY ictAffxniiRNLiiMJBtirYLrrErwiS.iiHi u u: : ; > M Kffa^c^y'TCK 

3DPTSRKLAADHYPYGrUX3^rrrNRKLrrHfUYRLDTkPKO\%rr3L^^^ 
KSKQLYLKKQLPKR 

CPn_0689 771407 7701^7

yfhO-NifS-related Aminotransferase

■ -r.TtiA V/Vv/F v-,v::-: • i •:: r^- •. '.-^lu -\ : . ' vm: t : - 1-- • **• vr.v.-TAi: 

HHAWL5WE r ACRRRCiLVKK I RVHDSGL ICL0DLEKLLNEGA0FV3 1 PKVSNVTCCVOP 
LOOVAELVHRYOAYLAVDGAOGAPHLPXOVOLWDVDFYVFSSHKIYCPTCICVLYCKXDL 
LDOLPP^rt«X;OKVAIYOHONPEYLPAPMKFEAGTPNrACVLGLCAALOYUXa.SAKrr^ 
DKEIALTTYLHKELLEI PGVEI LCPS I EEPRCALIC^^•I0GA^^PLOU;FLLOLRCIAVRT 
GHOCAOPAMERWNVCHVLRVSLCIYNDEDDIDQFILVLCDSLOKIRR 

CPn_0690 772704 771436

ABC Transporter Membrane Protein 

LSVLRGDKVLVS lETFSS r ASGSPVOKAAEACYTOYSKOPSSKEVLSSFSVaOELSLTPD 

RYMATGASELIKOHWLHNNHSLAFECILirCKYEPSLSQLPBCVirarOCARGSLSSr 

MCXIFDVNKH PLAFLMAVCSEDRCWIYIPEEMOTSOP IFVRH ISFPTVSDHtSVIFSPRIV 

VILC0RASAQI0rSHDVDLQWGSSKTIVTWrELrW3ECADL7VFMVPGYSEEr^^ 

TrATVEKnAICRKTONTIiSCOGFCWFDOTSYIVCKKCHAESLVLVOSPRKTWVW^ 

DAErrVSRONIKSILYSCHFLFECTISISSOGCLSDANOKHDTLLLSSEARVSTrPRLEI 

CTDEVKASHCATVCPLDPOOIFYMRSRG^^•EAEAOEKLIHG^LK0CLVSOTFlfiSSF0LN 

QTS 

CPn_0691 773467 772685

CT691 hypothetical protein

RGUKMUCIKHU^ASC^roVKILDOFNLNI0PCTMHVIMGP^CAGKSTLAKItJ«a>E 

SSGEIALOEONLLSHLPEERSRAGLFVCFOMPPEZPGVra^FLRDAYNARRItWBSDZ 

SIOEFNTLLSTVLETYEYNATrOLFLDRKVNECFSGGERKIWEICC^CVXEPEKWJJ)E^ 

DSGL0VZ>AIJU.ICRVLEKYK£LKFTSSI£IVT1{NPKLGNLXRP0^^ 

SLMHELEAKSYOEVTKRVAWR 

CPn_0692 774945 773461

ABC Transporter 

lOEFCATGLKVMGESVlCVFLEEREDYPVGFVTPIESOGLTRGLSEETIEEXAALRNEPOr 

I lOFRLOAYRYWXOUf EPAWARLHYGPIAYDOrVYFSSPKOKKPLCRLECADPEZLDrFK 

KLGtPU)EOKRLLNVENVAVDLVrDSVSICTrFKEALEKACVIFCSU;EAIOB<PNLVKK 

YLCSWSHRENFFAALNAAVFSOGSFVYVPKCVKCPKDI STYFRHMCEWFERTLIW 

EDCXnrASYLEGCTAPAYSSNQUlAAVVELVAHEHAVIRYSTVOhlWYACOXlC^^ 

VTXRGLCAGYKSKISWSQVEVGAAITWXYPSCILKCOESVGEFYSVALTSSKH^AXm;^ 

KLHVGKRTT5TVI5KCISSDESK»rrrRSLVSLCKKAEHS5NYT0CDSMLZGKASGAYTOP 

KIWEMSTSSIEHEArrSKLREOOLLYLRSRGLSPEEAVSLVIHGPCREtlEQLPLEFAO 

EA5KLLLIKLD1SVG 

CPn_0693 776292 775240

TPR Repeats (O-Linked GlcNAc Transferase homolog)

LRSTNHVLCEISKESAAXHLAKEFirSCrNI^LSGEYEOAEKRLKETIXZAST^^ 

LGZIALETGRVSEAUACSKCLASEPGDSYLRYCYGVAI^ORGNQYEAXIEOYSAYVALKP 

DONTBCWFSLCSVYHRLXRI^EALrcFDKILALDPWNPOSLYNXANaLSEKZJDEA^ 

EV/AVAXNPLYWKAWVKljCFLLSRSKRWOKATEAYERVVQLRPDLSDGHYraxa^Y^^ 

TRUUJCAFOEALFLKAEDAOAHFYVGLAHLDLKOMRCAYBAFNSALSZNLEHEIUU^^ 

YUWMOCETDKATKELLFLOKKOSTFAFLLQICrWSOPSSHOFERRLDrrXS 

CPn_0694 779635 776330

pbp2-PBP2-transglycolase/transpeptidase

FSDESEAHNIHSKKRPKKFPIYLSIAQKTNRLLSGIVIAFAVIALRLWYlAVVDaQKLE 
EAYKPQIRVLPQYVERATtCO RFCR l' U VVNQLQYPVSVAYCAIRDLPTWWRVDEHCHKO 
LIPVmKKYIKCLSELL50£LKt^R£AZE0AIHAKASVZX;SVPYLVAAl^EirryUCL»^ 
SKI>n<3LHN^WRRHYPOESVASDILGYVGPISL0EYKRVTOELS0LRBCVRAYEIX:£D 
PKLPBGIJ^IDOVRALLCSVESNAYSLrULVGKMGVCACWDSiaRGKIGKKPZLVDR^ 
FZ0QIBGAWEAPGTKL0LTLSA£L0AYADAUX£YE]CTETFlt5AKSUCKREICLPPLrPW 
ZK0GAZZALDPN^X;EZLAHASSPRYRNNDFVNAKVA£0SKAVRS5ZYRWLQaC^HZAEXY 
DRWL Z RERRNPLTGLCYEEZLPLTFtXFLDFLFPENSVZKLOUCRHSFVCOAinWL 
VTRLLSLFPYEEGTCPCSAIFOAVFPNEECHZLZOEVZSLOEOKWZMBCLNOHKADZEEL 
KEALTOVFNELPAfTfDKZLYTDIUlLIVDPERFSPVLPSEVHRLSLSEFTnXXSRYVVLR 
SAFSTILEDAFIEVHFKSVmKSEFLOYLAAKR0E£AUUCORYPTPYVDVLEEEICrR0YKM 
FCOEHLDTFLAYLFSKTPYKECLEPYYOZLDLWZNELONGAHRALSWHEHYLFUCERVSH 
LSEHLPALFSTFREFNELORPLLCKYPrSXVRNKRQTEODLAASFYPVYCYCYLRPKAYG 
OAATLGS IFKLVSAYSVLSOR I LWCHNEEPANPLVriDKNSFCYRSSKPHVCFFKDCTPr 
PTFFRGGSLPCNCFMSRCFIDLVSALEMSSNPYFSLLVCEGLGDPEOLADAASLFGrCEK 
TCLCLPGEYAGRVPHDLAYKRSGLYATAXCOHTLWTPLOTAVMLASLVNOCWYVPKU. 
LCEWaSEHVSYLSSKKKRTIFHPDAVVEVUCTCMRNVIWOOYCTARAIOSOFPPOLLSRI 
ICKTSTAES IMRVCLOREYGTMKMKO IWFAAVGFSDOOLSLPT r W I VYLRLCEFCREAA 
PMAVKMIDMWEKWRESFLRG 

CPn_0695 780201 781382

homologous to CT695 

SLEVSMKKLLKSALLSAAFACSVCSLOALPVCNPSDPSLLZOCTIWBCAAGDPCDPCA'IV 

CDAlSLRACnt-GnVFDRILKVDAPKTFSMGAKPTGSAAANYTTAVORPNPAYMKHLHDA 

E^rtTNACFrALNIW^RFD^W^LCAS^^YIRCNSTAFNLVCLFCVKCr^V^^ 

3rXWELYTDT5FSWSVCARGALWECCCATLCAEFOYAOSKPKVEELNVICNVSOFSVNX 

PKCYKGVAFPLFTDACVATATCTKSATINYHEW0VCA3LSYRLNSLVPYICV0WSRATFO 

ADNZR I AOPKLrTAVUJLTAWNPSLLCNATALSTTOSFSOFMO I V5C0 INKFKSRKACCV 

TVCATLVDADKWSLTAEARL INERAAl IVSCQFRF 

CPn_0696 781703 7R2S'»'»

cr^i'j f> hypot her i nu t p t or ui 

N:/:r^MrLLTY^NFE rnv-orXF-'-^ra :klt r KDLM.';A^:Arf F'liiorrnrtWNPKMKLY ifee 

KNr;LY 1 1 MLAJCTU.VU'NAt ,1 H I r<KV lyONKTVLrVT rTKKOAKt ;V t RIVXA I EACEFF £AE 
RWt/;tXLTNMTT FRN.'; IKTLDK fFKW .SHWAYf.TKKtVVALLAKKHOKLLttNl JKIRYMK 
KAK;LLVW0r::YEKtA'/Ahw\KKI/:t|-VI^\I.ViyrNi:OlTI'rfHrVU'i'NlJtJ::i.K::rRLI[M 

vtKENt I FAKiiKt*: It- t7::i vK:;t.tvi'i iI^:aki*-*:: vritiiL':uKtt4MmM.i AKKmitw* 
r*:! -KltmiMt. i.'ii K..*;rtu r;: 

wv' ;r I Ml :dft ;hcti jfTLKvr :'A;r :rr' r \ .lov -r •/ :rir.Ki lAVAT/t .uk i * :i a::*v :kkk»r 
tTKi>;i [AAKTOAM rpALi mp/r:n.i-MNhfAVi'i<|.:».^/::rii.uii)i i-KYKVum-tAuy^AA 
.:: vjijV'iA.: r/OFLKAVTM'Trvi :w nui: :i' vav k» rMtt: rrri f Yf n c a* :K*r/Ai;mi ^> r.:'. 
TADSL*\KDI*\MHWMOPOFU:kE^PAEAIAKL tAJOCOCKPOr/IEKIVrCKLNT
FFOEACLLEC Pfl KNADt-T E02L:DDFr;)CT.V;33VAI ECF r LWKIGA 

pyrH-UMP Kinase
EP^«NMAX0TRRVLFK:3CEALJKD33^mrDEMRI^RLVSEXRAVRNNDI^ 
[LRCL.\ECKa.0IMRVr;ADOMCMLATL:NCMAVADAU<A£CrPCLLT^^ 
P^K.'^rD\LDCr.Kar'-r?nAG-p'/LTTDrrf:AALPJVCEifJVDV^ 



;-v;:F7:-V 



CPn_0699 7fl4l79 794721

rrf-Ribosome Releasing Factor
TKSVWDTEW<MAAALDFFHKEVK3FRTCKAHPALVETV\nWYCr^^ 
UOLVISPYDCNNASAIAKCriAANI^PEVECSriRIKVPOT^ 

CPn_0700 795094 785609 

SShspthocyhotpaticy^ 

LECGrKTKTAWSKODDEOLLCCHOCYTNFKNOITSKUCSERWSSSI^EKCCXJSLHICR 
APGEJ^NTNPUJa-IAUiEALQDTLEREDYEOAAVrRDOINHLKTiaJPDDPS 

CPn_0701 785584 796672

KpSoKTLPNDllLlTLyKRKESTO 

ITSHFTmiECFGEFrVLPUCETPLVCKETlXEHriXPYDLVGNPECEJUL^ 

INFQDHLVUiGIDFOGNVECTLDOLVQIXlSYXJlSKI^FAFSSErGFLTOI^^^ 

XF^HIPALLYSKEFT^l-IDEE:VEIITSSLIXG^m;FPGNIVVL^^ 

LRITASKI^AEVAAKKRLSEENSCTUCNLIUlSLGLLTHSCOmJCET^ 

DLGLimENHPLWTJPIJV0IRiUHIJ^K0AEDSW3lJ0KnriSHUlASVt^ 



CPn_0702 799700 786929

yscC/gspD-Yop C/Gen Secretion Protein D

LKXlWKTVILNrGRKIU:GIKKKKKKICIt^ClJriI)LVLLCVSSOR^^ 

RDEKIAACPKNSAASLSAKKSHTKm'PCSIPSKVFSKJmTODKTF^ICrSCSAFPA^ 

TLKEI^E3yaCPRPERJlTTADVKRSPRrLPTOEVEEPVPAXSK£OU5SIQVWrE^^ 

AVNAINLSIKKQLEECrrSTVTEKDVOPKTOATPHASKKNVASPSTSHPGIEKA^^ 

0DKSE2EKVKERLTKRELTCIDUCDNGYTVNFEDIS ILELLOFVSKISGTNTVFDSNpLQ 

FWrWSHDPTSVDDI^rLLOVU<MKDLKVm?GNNV^ 

TCEAVVVTRVFRLYSVSPSAAVNIIOPt^HDAIVSASEATRHVIISDIAajVDW 

AALDCPGTSVE^^TYEVKYANPAALVSYCOC^rumJ^nAfWFIQPG^^KI^^ 

LANKAEOlJLKSLDVPEMAHm)DPASTALALGGTGTTCPKSU^^ 

^}DIGYNLYv^^AMD£DFr^m<Nsr(>^L£v^wsmIGNO©wDRVICI^ 

YIEVLIUJTSLEKSWDrCVQWVAUIDEOSKVAYASGLlJNNTGIATPT^^ 
IPLPTPGOLTGFSDMLNSSSAFCU:i IGNVI5HKCKSrLTLGCIXSAUXJIX;i7W^^ 
RIMAODTOOASFFVCXrrVPYQTOrrilQETGTVTONIDYEDIGVN^ 
QIEOTISEUiSASGSLTPVTDKTYAATRWIPIXZrFXVMSCHIRDJCTTKVVSCV^^ 
PLIRGLFSRTIDQRQKRNIWlFIKPKVISSFEEXrrRVTNKBGYRYNWEADaaSMQVAPRH 
APBCQGFP5L0ACSOFKI lEIEAO 

CPn_0703 791205 789685

pkn5-S/T Protein Kinase 

RKIGFMDCRGGIPLPEPQVICGYHVKKII^KKLRSRVVHGLHPCTRHSTVrKVFSPSPSF 
TSRS\mJFUC£AOSU{OITHPNIVKrHRYGKWODCXY lAMEYIECI SLREYILAQFISLP 
QAIDI I FT>r AOALEHUiSRNIUiKDIKPENILITPOGKI KLIDFClAI^nTTEIORAHPS^ 
rCTPYYMSPEOROGESHSPASDIYAJUSLLAYELILCHLSUSRVFLSLVPERISKILAKAL 
OPSPN^mYSSTREFICDIHHYRMSGD^fOEOU^IKDHTVALYEOLQTQRFVLAPETLRFPD 
FISGVLYHOGYPLYPHAYimiXCDVFNLWLCYSPISNATIAIJVVKSLVCOQD^ 
DRVrtTEINECLIRKKI P lOEXII S ILCIZISKENKELSWI ACGKTVFWIKROGRVVODFES 
FSPGLGKITSLOIRCTKVAWEIGDEAVNCTLEIXESVASUCTLSLAaJOORROKAIFCPI 
ES IHOCIOSRQHCSNSPSTLISUCRIR 

CPn_0704 792330 791209 

fliN-Flagellar Motor Switch Domain/YscQ family

RYFHAVAADSSASWLKSRNNFt^SliGKTEEOVAAPEFPKEICOHKIREKFRLEDVOVSrK 

rRGSITAVEATKEFCVHLLIOPM\AA:PWEVENU.FLTS£EDLOELMVAVFDDASLASYFY 

EKDKlXCFHYYFVA£ACKLFEEXC>VPSLSAKVCGDAirTATSU:GSFOVVT)ISUUI^ 

rA/RCRI^P£OTF0SCQKFrSGUiDESDLHNIDC7rOOISLSVEVCYS0tT0EE>WVVPG 

SFIMLDSCLYOPETEESGALLTVOKHOFFGCRFLTPSSGEFKITSYPNLTHEDPPLPENP 

^ASAAPLPCYSRLV^^ARYSLAVSE^tKLNLCS ILSLGNH PAYCVOI lUXIAKVCRGEI 

lALGDVLGIRVLEV 

CPn_0705 793176 792334

CT671 hypothetical protein

FMELKKTAESLYSAKTD^mTVYQNSPEPRDSRDVKVF3LECKQTRCEKTTSSKG^lTRTES 
RKFADEEKRVDDEI AEVCSKEEE0ES0EFCLAE3JAFAGMSL r Q I AAACSAEAWEVAP I A 
V33 lOTOWI ENI I LOT^KV I SEINGEOLVELVLDASSSVPEAFVGANLTLVQSCODLS 
VXFSSFVOATOMAEAADLVTNNPSOLSSLVS ALKGHOt-TLKEFSVtNLLVOLPK I EEVQT 
PLHMIA3TIRHREEKOORD0NOKOKQDDKE0DSYKIEEARL 

CPn_0706 79368^ 793180

CT670 hypothetical protein 

vavar/ plepvwi kkdrvdraekwkekrrllei eqeklrekeaerdkvknhymokioo 
ijlolxdexnrcdavloi ksy i kwavolseeeekvnkokewlaaskelekaevnlakrr 
keeektrlhkeewmkeau;eearaeekeode>colcfolrokkkresgcs 

CPn_0707 /^SOIS 793704 

yscN-Yop N (Flagellar-Type ATPase) 

'/NMDOLTTDFr/ri>i:X)li:DVNLTTVVCR tTEWGML I KAWPNVRV.'KVCLVKRNGMEPL 

'/TEVvr;ra'i: TAFu: pu;eux:v J PS SLV r pTt^Lr-LH I RAGrK;Lu;KVLNnu:EP I w 
rt;T-\ijtrj\vT\'\- rFr(An'orutR/\KLROiu;Ti;vRC iccmltvak^vk ici facacvgks 
.:lu t AitNAKi-iAuvNv I AL u lERCREVREF t D^OL^F.L-' ;MKni:v r w:rr:;DO::iX)LnLN 

AAYV'rrArAh*YFl'L/.ii:KTVVIWMD:^rriIFARALnL*'/':LJV/VjEPt*AK/V:YTr':.-VF3TLPRL 

i.ER:yyv:;t)K{ rr r•l■AF^'T^•t.v.^. irniftNErvAOEVK.: r ixr :i i r vr j :NALAyAYi ivpa [Dvla 
f ::iti.LTA t vr'Kr-7jRf< i u ikahlvlakyk.xniimli r*. i'ikyI'W ;:;rKEiPFA r dm iDiciiJR 

FI.KOli t m-KTrr^f- DXiKiOLUA I FR 

■;f.„ «i7MM /•»',■/ »?: 7 ISO J 4 



^^^ISFKPNPMAicW '^^'^fJ'!^^*^:^^ 
ALlYFSDRDChmE3:jWFL)C/OYAV0RATCRA£LF.\JrVt^^^^ 



CPn_0709 



7OS203 '"'S74: 



CT667 hypothetical protein 

QV liwCK lESTRAXJw^iVU/WHITr LVAK PUJ 

CPn_0710 796482 796210 

CT666 hypothetical protein 

RSRCEKSMATl«SCTAF0FNl<MUCVCTVVKCVQ0Yt,TEIXTSTa7^ 
0 1 LSOYMESVSNILTAWrEHITMARAVKCS 

CPn_0711 796791 796486 

CT665 hypothetical protein 

TriNt«VU3FINYLYlJGRYSMFKKEN7AKEE}CNS0Pt^t£CCM0DH0RA0ELI^^ 
VHKUiAIXRECSDKESFCOQCSLLAGYVALOKVLGRINRXMI 

CPn_0712 799315 796781 

(FHA domain; homology to adenylate cyclase) 

MAVRLI VPEGPLSC/t FVLEDG ISWS IGRDSS ANDI PI EDPKLCASQXI INKTDCSYYIT 

^a.DD^rPrVVNCVArOETOLJ(NEt7rIUX;SNOYSF^OEFOPODmDFDIPEElIFS^^ 

SGDLSDSNEOGKDI^PRCTTSrrTWSPKPKEKLTKDOGSSDPITSGOQELADAFLASAKAE 

KNOPRAKVAKXCLK£SSNCSLNPK£0NAKDSPKCEERT^n<PO^U^ZM£DNG^ 

SAEPSLKNrAROETPLKEJnCP\raCAWaCATPDSPEKlCDOPEBCSKKEGSKIEATPLDSQ 

KESEDKEAEEAFVOEEEaiLTECMCEOSOSAAOANDETTASDHTAEDNKErPKICV^^ 

Vl^PFHVODLrRraTrirPAEtDOIAKKNISVDLTOPSRFLlJCVLACANIGAErHUMGK 

TYILCTDPTTtXtVFNDLSVSHWAKriVGtnXSCILIEDUJSK^^ 

TLPASSriLTUWnJUU^ICTASLFHTKENA^LENIOYOnJLAOVINOFP^ 

KrHSOLFLIGHVW<STOKSELLYKVnALSFVKS\mONVIDDEAVV<CEW 

ISMHSPEraCFIITCYVKTEEOAACLVDYUJIHFNYLSU^OT^^ 

CCFANIHVATVWSEVTLTCYVNNDOAEKFRAWOEl^IPGVRLVKNrAVliPAEEC 

L^rtJlYPNRYPVTCYSRYGEISrNVVV^CRILTRCDVIDCamr^SIOP^iA^FLEKE^^ 

IDWK 

CPn_0713 799817 799332 

CT663 hypothetical protein 

LDLKEEIcisFRNEIVSIPCXnTCrriAALENTSMLEiarKNFATYWJlTSTL^^ 

LPISEVVK^mAQO^^ADNEI^a-SASuyu^PSAD^AKLYU)^mIC^^ 

EiaJVVKVRRrSGOTTYDDFVRHVESFWJFSETWLSDLCUSKQ 

CPn_0714 801125 800091 

hemA-Glutamyl tRNA Reductase 

NYRrVLMVLCWISYREAAUCERERAIOYWSFEKNLFLAORTL^ 

ELYYYSESPEIAOAALLSELTSOCrRPYRHRCLSCFTHLrOVTSGIDSLIFOTTEIQOOV 

KRAYUCCSKERELPFDUffLFOKALKBCKEYRSRICn'DHOVTIESW^ 

TImJVCVSDINRICVAAYLYQHGYHRITFCSRQO^^APYRTLSRETLS^ROPYCVlrTCS 

SESASOFSOl^CESIJ^IPKRIVFOrW^RTFLWKETPTGFVYlilDFISBCVOKRLOCT 

KEX;VKKAKlXLTCAAKKOWEtYEKKSSHITOROISSPRIPSVLSY 

CPn_0715 801636 803462 

gyrB-DNA Gyrase Subunit B 

KnOCISHMAAYTEASIt^IJ^LDHIRUWCKYIGRLGtCSOKnCIYT^^ 

FIMCHGKSUCISASDXOISICtXXniC I PLCKLIDCVSKIirrcAKrroW^ 

LKXVNAI^EirSVRSVRKKKYHLATFHRGVLOESKOGSTKDPIXrrFVSFTPDPSIFPErr 

FI^FLKDKIROYTYLHSGLEIRFNDEVFISHNCLICDLFOAEITEPPLYSPLFFONEBLT 

FI^SH^.BC^rraYFSFV^KK3ETIX0CTHLTAFKEMVKCVNE^FCmVS^ 

IAIKIASPIFESQTKWaX:^mJIRSSLIKOVKEA^^^AU«CDXVW 

KNIOFIKODLKSKOKKVHYKIPKLRDCKFHYNDRSLYGEASSIFLTBCESASASItASRN 

PLTOAVFSLRCKPM^IVFSLE^^KKYKNDELFYlATALCITO^^EIOHUtYN!CVILATn^ 

DCKH IRNLL ITFFUCTU.PLVTWiHLFILETPLFKVRN)CrmYVYSEOE»«Al{QfCK 

KDSSlXITRFKGLCEISPKEFAAFICPEIRLTPVTITSLESISSIUJFYKatNTKERXOF 

IKIMLITOF 



CPn_0716 803466 804902 

gyrA-DNA Gyrase Subunit A 
mRDVSEl^RTHFlWyASYVILERAIPHILDCLKPVORRIXWTLFUffiDCKHHKVANIAC 
RTMAIiiPHGOAPIVEALVVaANKCYLiaiWNFt^PLTGDPHAAARYrEARLSPLARETL 

fktdli afhdsydgrekepdi lpaklp\rt.uj<cvix; i avcmttkifphnfaeujcw^ 
l^rokk^^vfpofpsgglmdpseyodgu;s itlras i di indktlwko icposttetlir 
siellaakrctikictiodfstovphieiklpkcsrakemlpllfehtecovilyskptvi 
yenkpvecs ise 1 uvlkttalogylekeu-lloeoltldhyhktleyi f i khklydsvre 
vlainkkisaddlhoav^alepvrt^elatpvtkodtsolasltikkttxrfneeactkei. 
laiekkoaaiqkdljrike\tvkylkgllerhghu;epjctoitmfictaktsilkqotli 

CPn_0717 8D4^»68 WSJOb 

CT^?6 hypothetical protein 

IR IKFIITTITrVVRMEPRH IV XRKPETPKAPDVEKPr^^pEYMT^t^^^•^TFeCPVKTLTO 
RRALl EORCAEECOKKYDNF IvJS X LI STFCL-VliKDMDPAOKASKRHRSVYKBO 

CPn_0718 RO530n 80562*; 

CTfJji hypothetical protein 

RAm':FTYFLALPVPRLK0ERFLCCrKRWAPFmSPLYLTLIADHDrrr\'LAKNU3KFPLP 
VE'>ranVtjr/.':3LLKJIFU:r:0LSJUlLU\L'TKFE I LTUiOLYi:AON t 

CPri_*>7i> Mt»»;**77 H»t>M'»n 

;;frili (iv:ii«lriui ulim» :*.vnrh.v.:<*» 

FD fKt.VKKV I K I ::MK-TVT:;FTVf;K Wr-f IRLUKYLTEVItr-KYCRAh-Yyi-II t UZf ILWQ INOO 

itrn'7An«ir':tit\T tDf ot'KF.FiJ.ELunw ri U/rr/ Eiy;M t i.v iNKi'RWiWHPAPG 
MKtK;rr,viiAi.iJii:u:Ki'j.KKi;FPKKi'Wkh : i ynr(ij>Kr/r:y:t.i itaktuoakkvrielfs 
TK f'( .kk: :vr -va : i. '.k i k: rrr 1 1 m i r : K>mt* kikt/.'::// :k ij\vt» » vvi akm^klsfv 
Ai.;:i-i:n:KiiM^ijiviwKHUTrrtt/:i«w;iP:w^':r:Y';iXKfAMJiAV::vi>^^^ 
TLLV.T/A;;PrftAn»V3LEIMEEK 

CPn_0721 Rn7^7l ^OHAHO 

kdsA-KDO Synthetase 

KRMVMFWNKM i L I AGFC/Z £CZ" ITLZ I ACKL03 IL-^PYSCa lOWFrf^^TrDKANRSSLN 
.^FRCPlLTEGLn rL/\KVKrrr'r.r: :LTCVKTPCDAyAAAEV*CN:LCV?AFLCRCTmVA 

TA T vwr y ^^ pwr><Fn r : f [K-/r .: n^rrr nr : : T rp'^rr: PT'? IT :: //r; 

"rnu. \.\ ": ;*. .;. :: :.mv. a 

CPn_0722 809477 808974 

CT654 hypothetical protein 

VGLSKTKFLYCSLr^St^LU-^rGTKVA r I0VIX3 1 CDV3CMNKHF0E3 PPFUCIKKV^ 
SKOICSPEERFFHCKIDKSCMELHFPOSSYSCKEYLTRISGHILTONFEKOMOniCNSGL 
l^ODCSLHVYCCRFOVDPVPCYGSPDKEDSSSGGMKTLYLSLFRN 

CPn_0723 808973 809703 

yhbG-ABC Transporter ATPase 

ASMPILSVCNLVroCYNKXPVTNDVSFOTNFCEIVCLIXSPNCAGKTTAFYLr^ 

KI I FKNVt3^TKIaMDHR^RLG ICYUOEPTt FKE1.TV0D^^.IC ILXr I YKARKW^ 

TL\^DL0LG3CUiKKACTLSGGERRRLErACVtja^PSVU.IXEPFANVDPLVIQNVKYL 

IKILACRGIGILrTDHNAKELLSIADRCYLirDGKIFFECSSSQHISNPMVKOHYLGDSF 

SY 

CPn_0724 810602 809706 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RTSTRLDYRSGCILSKILPFPELWWUjGFI/rDCPCASVWIAAVANCYDSVFMSRPEHKP 
N I PYITKATRRCUlMmAYlASLKDAROLAYDrUCDPGSlARlAKALI APKEAM 

f fyccsniedileemrrphr illlgfsycokpjo^pecrfndacrydpshptcascs ict 

*wruwirywi:ptfidiakhlhtlkkrypcyoiiJ'avtacelsu<mfgdyasv^ 

gvgiiu-tcricntfkafklj^gvkpcvtiixedcfdaarilteyssapfprdfceih 

CPn_0725 810829 810587 

CT552.1 hypothetical protein 

SCGIMMrFAPLLYESUlfiGUffiPTSHMQ00IJVRIXFIira3LTT£rXHVl^^ 
GLTTIKAIAEEVLSDDEPLLD 

CPn_0726 813384 810880 

CT620 hypothetical protein 

AOIDMIYSTSISTFYKKLSLVSSMHSFAORHRESt^IANYEKTTAERDILKRLIEVLDQ 

RA5£RYRSAVEKUaCYEVERAWAKS IPVAAIHEKPLSSTHASVONH'ASTPAATGSG^ 

YY>WVKOKWAQDLIVEUm/Wn'IMASVNSKNPANKDVn)KLm^l/^ 

FQTLYNFPEEIFTAIQRArrrFTGGHKTDFTNOU^GKYGNQATLTQTFAIXIRra 

AV0GVLTPE0FTIFAEIATEWAIJ^H\roNrOEAGU)RIEnACEKLAAVrNSSDLTRNDK 

IMFCOKITDLYSDOVAAX£SFCTVUlASIYVNQHQC?mFSNLSSrVGSLIGT^ 

SOGOISSAAIJUIALtjrARGLNSRFNELTAEOQKLINECIKSLVTrKCCEHljGAIWAYFTA 

STWALNPTA'mDHVKAAI LEEAKELDNSSFOLASS rKSA>frS rVNSSGSFSVTVNSSTL 

QYTIYSEKNGK\rtINOrLLr«GSTGFLPErTKLAKTNAESTARSYFTlFKAl^ 

NKIEDLQSOLQOnTWKTELrDGOLl^ASEUlALPU»SAVASVLIDRYKPKEVDY^ 

YKKLYYSNl^SIGNSIIDAIS0YVNGATYrNFASYVGQQPA\raW3GANA7PGS0ESAQA 

KLtWERKOAALYLOETRGALTVIEEORARVLKDDKITNEQRSTILDSLRNYEONINSISG 

SLVUjOrraOPLSIACGSVAGTFEVKEGOEQWQAIUXJILEEALVSGLVCNMINGGMFPLO 

STIQSDCX}SFADMGQNFOU5L0MHLTS«QOEVIWATSLOU^0MYI^rjWSLTC 

CPn_0727 813559 816192 

CT619 hypothetical protein 

KYYIJ-SMSTFSIONRI^TISCESTRI IKLOHKYSGFDPRSWAlNLEEtNSC I YALRHLM 
^lALOSE^r^WAAUJ^P^^IFPTrSW^DYKHSRPOASSPRAPSSaTPTDIVSAAAUa,V^ 
VI DGGt^AELVASVTEI DLCALSTISTVROLMASYLGLTTLTAEQEKVVrSSSYVPSEKNL 
LEHVKOEKAAEIOAKOEEIKAVtXAKCVSTEEIEAIUCEYPDIYAADFFKErrEEPUrrY 
RAKVCAPIOEIWENAIOtXPTPPAITPDWNEVbCMNTLSTILOAIODAIKOAPALO^ 
EirTrLOTLVPLVDKTTrrKAEFDLIYTATOLPm-ASLKLYLTDROIAEYRCKITKVYON 

s ronlsetkrvve^^^mshletolsmfwao^rcfvtwisoanau^ia^tnkyisavlttsm 

emygci^lsymyeruusdocai fdksvneylpih ivvggswvngwiakmaayoeia^ 

lctavtsodoikaywtrcnefkatrhffhnicdowofanetvfcnclttangaiopdl 

ggfirea^f^nvctveadyvs^worilnef^^'aatahvl/dl0l0raeujkkaddu^ 

ftenrkfavaawitseslgdalismil>isolpkoeafij(plieeinf^i^^laanalnsllo 

ittiefsttswslssylvosktconlfacdyyetu-aaarereyiyrdtarckoainlv 

nguiokinslpgatsaokoemlnattyyoyslsvtl^ltvlesujw^lkkrl^ 

ydks\^kiesfodwipti-v^esfltsgfpnisatcgi:gplftovosdootytsogctqo 

lnloncktt looewtlvstsmovlng i lsqlagai ysn 

CPn_0728 818483 816525 

CHLPN 76kDa Homolog (CT622) 

VFW/NPICPGPIDETERTPPAOLSAOGLEASAANKSAEAORIACAEAKPKESKTDSVERW 
3 r LRSAVNALMSLADKLC t AaSNSSSSTSRSADVDSTTATAPTPPPPTFDDYKTOAQTAY 
DT I FTCTCLAD I OAALVSLCDAVTN I KDTAATDEETA I AAEWETKNADAVKVGAQITELA 
KYA3DN0A I LDSLCKLTGFOLLOAALLOSVANNNKAAELLKEHODNPWPCKTPAI AOSL 
VDQTOATATCI EKDCNAtRDAYFAGONASCAVENAKSNNS tSNIDSAKAAIATAKTQIAE 
AOKKFPDS P t LOEAEOMVIOAEKDUCNI KPADCSDVPNPCTTVCG3K00CSS ICS IRVSM 
LLODAENETAS I LMSCFRUM I HHFin'ENPDSQAAOOELAAOARAAKAACDDS AAAALAOA 
OKALEAALGKACQOOCI LNALGOI A3AAWSAC\»'PPAAASS ICSSVKOLYKTSKSTGSDY 
KTO ISACYDAY KS t NDAYCRARNDATRDVINNVSTPALTRSVPRARTEARCPEKTDOALA 
HVICCNSRTLCDVYSOVSALOSVMOIIOSNPOANNEEIROKLTSAVTKPPOFGYPYVOUS 
MDSTOK F T AKLESI.FAEC3RTAAEIKALS FETNS LF tOOVLVN IC3LYSCYL0 

CPn_0729 3l'.»905 rtl8592 

CHLPN 76kDa Homolog (CT623) 

PAWnnv^ri'LN I DTKDTMKkVVYOWLA:;VVLLALT tL*CYAELr-U;EOKVKi;KrnTLDEVK 
DYL':Kf<f;FVETHKODCVUlACDVRARWLYFREDlKrJP::DKDKYNrtPV.NnYI<.';EFYLYI 
t)YRAFJ>NWI .r^.'^tMNWrA [ Ai «'F^,\Ai':VD [NHAFLGYRFYKMPLTRTDFFME rCft:;GU:0 

I ^••E::Evor'0: :nflx :ui t '.-wtr el;;kpv pyov t vi ir/j f'FWNKTKKn yawwd ; i lnrlpk 
FVKcr.'.wLWNTvvrrtrrr: rrTEK/VvrNAMK YK Yi :MM\Mf ;ki i: vvi w t mt/jkk plyly 
• y\ri f<ni -I -^KATKT^t.^x;K [-.ni j^wi* (i* rpLry :lr ka' :i m: :A*r/i(YKVvtiAU';vPE i d^/ca ; 

I' :nr \ni .1 .KI-VKAOA l/V\fA'prKF^\M IfTNYKt ;k: ialyhy. : rTI/; :KfiAY{;AY.';KPANnK 

(j;:;rjFTn'KFiHi:( t:;Af 

mviN-Integral Membrane Protein 
t4 

UIFRTVFFLPK lUyiLlLZ;.. .rHF£rt,RACJU:RA#\KFFFPF.:RL:i'-JT:iFTU.IE 
AVLWWUjr/EECr/tM : «:,T?1 : LLF^TG [ FLMm-TA/WALLH^TENKFrr/^lJVPVVW 

w [ FFv c AARHsopaER':':oi^i/ALvi[<>i^jF;Evc rruint^MKFU^Alici^^^ilf^^ 
APL^LC iltg:: ircuc::^3rciv^RYVKEiGptYW3LKiYCCFTftt:^;Ftm 

r3RCVOREDHERCULHKrVL7LTMS\.>1EIMTAGLCLiALPGVRVl.YEHGLrP0SAm: 
VRVLRCY'3A3 1 1 PHALAPLVC^'LF^'AOROYAVPLF I J tGTAZ-VJIVlJLV'LCRWVtJCDVS 
'i: -YATSrTAWyCLYfLWfY.TSKRLPMYSKLLWES IRRS [KVMCTTJOACMITLCLNILr 

7Tr'n/rrLj;pLT?r^v/rLn;::TAOATAF:.rR::crFtJVF'wFnFAKLXRVE0LirrtASFEYw 



CPn_0731 ■'iil4J4 'iJUoO 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VAI AISRNI P"/! RLCr/PDNI LK I ERAKETSLSFLLIKPrSPPPLKCDVLFOISPYTSSE 
17 IGCSYnONKASLCSSTLRLRS IS 1 1 S 

CPn_0732 822092 822976 

nfo-Endonuclease IV 

NFHKVLPPPSI PLLCAKTSTACGUCNAX YBCROIGASTVQ I FTANOROWORRALKEEVIE 
DFKAAUCETCt^IMSHACYLINPCAPDPVILEKSRICIYCEILa:iTLS:SFVKFHP«GA 
ALKSSKEIXM^IVS3FS0SAP[^SSPPLVVUUETTA£WrLrGSNr£EUrn.VQNLICN 
QIPIGVCVITrCHrFAAGYOrrSPOGWEDVLNEFDEYVCLSYUlAPItLOT 
HAPUJECY IGKESFKFIJfTDERTRKt PKYLETPGCPENWOKEIGELLKTSJOWOS 

CPn_0733 823739 823101 

rs4-S4 Ribosomal Protein 

CLKYMARYCGPKNRVARRFGANrFGRSRNPLLKKPHPPCOHChfORXKKSOYCLCLEEKOK 
UCACYCMIMEKOLVKAFKEVIHKOGNVAOMfX.ERrECRlXhMVVRJCF^ 
HCKILV^K;RRVDRRSFFU^PCM0ISLK£KSKRWSVKnALESKDESSLPSVrSU5KTGFK 
GELLVSPEODQIEAOLPLPINrSWCEFLSHRT 

CPn_0734 823863 824915 

yceA 

OrnrEHFSSNGmXX:NYFODYVRVFIKEKKYYALAYYYITR\T3OTHEEIAl^^ 

DVSC^lIYISEOGI^COFSGYEPHAELYK>^LKERPNFSKrKFKIHHIKENIrPRlTVICYR 

KE^JU^LGCEVDLSK0AKHISPOEWHEKWENRCLILD^a^NNVEWKICK^^^ 

REFPEyAEKIJ«3BC0PETrPVM«YCTOIRCELYSPVU.EKCFXEVY0LDCCVIAYO^ 

GTGKWLCKLFVFOORLAI PIDESDPDVAP rAECCHCQTPSDAYYWCAOTTCNALFLOCDE 

CIHOHQCCCCEECSOSPRVRKFDSSRGNKPFRRAHLCEISENSESASCCLI 

CPn_0735 825680 825003 

"uridine Kinase (Uridine Monophosphokinase) (Pyrimidine 
Ribonucleoside Kinase)" 

GEXnO^MLMMI ICITOGSGACKTTLTONrKEIFCEOVSVICQDNYYKORSHYTPEERAN 
LIWDHPDAFaJDLLISDIKRLKIJNEIVOAPVrDFVLCNRSKTEIETIYPSKVILVBCILV 
FENOELRDLMDI RIFVDTDADERILRRMVRnvOEQGDSVDC IMSRYLSHVKPKHCKFIEP 
TRKYADI IVHCNYRONVVmi LSQKl KNHLENALESDETYYKVNSK 

CPn_0736 827731 825992 

ygeD-Efflux Protein 

RGELLKLAROCLVAFWTVSVKICKSFRALVTTHFLTI INDNLYKFLLMTLLKaCTLTBlA 

KILSCVSFTFALPFUXAPUVGSLADRFOKRNI ILATRriEILCTILCTYTmOSWOC 

YVVLIUlACHTTIFGPAKLCILPEMLPSEOt^OArCIJfTAATrrCSruaClJ^^ 

HRI/;VNSYVWPTLMWIVSIISTLISFCIRPSNVKNVKOKITLVSFKDLI«^^ 

YLTVSIFLCSFFIilGAYTOLEIIPrVEFTlJCYPKJrtCAYIJ'PIVALCVCTCSy 

GKDIKICYVPLAAIGLAt.VFMGLYAFACS ILFVLFFIXALCFLGCVYOVPUUYVTOSP 

EHKRGOIIAANNFUJrFGVLVAACVTRVLCSNLCI^PErSFFYrGWrVWVSIWrLHI^ 

EHVYRU^IILRROUrnrLKIHOSSSPKCyrVAVOSYR£tRRVtJUU.TKTVRSRVIl^ 

0KLVFGWRAWU*SVOnnTWSSVRIWDSEA0DAWAVL0ANHLKTSLKKFPDVSVVC^^ 

KNVERfTSILOEOGIDLHPIOLVOKECKKRVIYTLVFPHA 

CPn_0737 827469 830756 

"recC-Exodeoxyribonuclease V, Gamma" 

KRSAKLPASCASKRKCRAKKKLTOERIFAFSVRVLPSNRKNAKRNLYKLSFI IVRICCVVr 
SALNDFFLTEWKNATKHCRASFSNSPRHUAOLAEDITSTHOKPFTKRVrtLVAIWTrCH 
WI KNOLVHVLSDH I FMCSTI FTASDS IVKHLFLCSGCSOPNI PDYLTLPLLINMILEEIS 
KASKFENCREFLSPPTYErrXKLAAAFKOFHTFSORPTKNASHYOELFOlLESHFSSYEE 
MFTTILNNRTOEEDCSLHIFGYAHLPKHLAEFFINLSTYFPVYFYCFSPCREyFCOLLSD 
RAIDFFWWLPDSPIKNAWEHYVl^DROALLANLAHKSOSSONFFLDREXDYOEMFLPSK 
HDSSLGVIONS I LDU PTSPODFSCniCOT ICI YRAl^II PREVOEVFCKVTELLHRGVSPE 
EIFI LSSH I ESYKVHLNAI FWPHVPt YFTOEVDPRAEDLRNKILLLSSILOTOCDLHYIL 
OLLTHPOLOOPIDONKVPYLtKKLSSEWGKISSKDRASGOOMKALCDLlLEEYPFHOBOG 
RVSOVEVWETTVPLIYFIOERINLYLSSSOHSYEDLrONVFSCLEKIFVLSPECrSrm 
LRNSLFPTFilSSCSLLFFTOrCLDFLLHFHKPSPLYDKPCPYIGSLSSLSLIPKCyvri 
WANKTTSSD I FDU^RTTTHEELAFSSTEDEENFHFLQ ILVSTKHELHISYISSAAOFN 
LPSPFLNHIKETLDLPV'ETLPTOPYLSAFFKNKACLHTSOEYNYStJUiAFYSKKALLPSL 
F I PTVK0VNLP0HL3LNEI IKG rrSPLDLFLKTWYNLRISYPEHLKKOOKLTPTKHOIED 
FWNECFVDKEHDLI P^ ISPHAEELFTYYREKTltXRMCLDKOPKHSPVTVTFSSSIFEER 
PYHEGYLFPPLSLSFOCNPVOIHCTIHGVOIEGLYLCSIDPROSUCICTTRTLCSLPETSS 
EOKOLLERYVALAVLOMSOHt^SOSALIKLTSniTKENHHPPFSDPBCyLRICVLEVYHLM 
5S0PI PLLS rLCWKTLDOEEKFHOAVLSAr 3 EEAKNPSLP I FWOFKNRN t EEXLNHVGA5 
ERLKILSLFRCPCEAV 

CPn_0738 830710 833895 

"recB-Exodeoxyribonuclease V, Beta" 

KFYLFSEVrVKPFN I FOSNSS lOGKFFLEASAGTGKTFTI EOIVLRALI BCSLTHVEKAL 
A irmJASTNELKVR I KDNtJyjTLREtJOVVUJCQPASLPr/LDIfCNWOIYMaVRNALA 
TLOC>CLFT IHCFCNFVT.EOYFPKTR L I HKNPALTOSOL*/LHH ITWUCODLWKNVXJQE 
OFHLLAVRYN ITUKI rilJLVDKLLA.rrTOP [G"/F.':nRVERLEO 1 3LWH00 1 YNSLLEIP 
KOVFLDOLTAH I .TCFKKOPF:; r LCCLHHFVDLL'rrGETH33LFSFFK I AETFNFKHRLAR 
YKrCAAFTVLENM:nA*ERTr.r>*aiLDn I FNTLLVDLOEYLKONYTWLtfPDESVTALEKL 

i^r;nEAorvvvAijiFA:Yot.vLrDF.For/rDKW/'.':rF.^riLFr3rKFTn:;LFLicDPicosiv 

bWH JADLr^YLTAK::.■:^;;EDKOL0U/^I^^/nrrrr•KLMEA ItIO t PCK I ::rFLEI PtJYLPIEY 
IIAlWlO-:;CTFENrniAI'I IIFKFYrrr I KOrjAt.WtFnEALPLOKEOK rrUWMWLVSOSN 

OAFELi :;yat r pv::f::knk:: r Fiii.TE-ni t i.TTAixEAX Uf PErr/EK it.F.':;:LFCu:t 
i)h;vrTKKEPrrtYFo:WJi:;Yr-.fiiK;tJArw/Krrc/';tr/u':;::r'p(;ni.[rv£^ 
MTrir;::YrYMOLUiLKNf*:;R-h :t'vmr:i:i j\ r;;::Y::EOurrLKiTT rn:;::KiU£YDivPCPC 
t EKr:KKNK2:::;EU Jif>n-VA( -rr'AKroi.YU - 1 :rroPt<:tJjP.i:t:ALrwvKLEnTor*iiAyD 
i-\iMuryFJ(POLF:;v:U.t-KOii(:)iAnvrjjri-LLETFAt.r/TPPK-rii'\sK:;:rrKFUj^ 
^)::u:IfPY::KLr[:;KCOUH*;^iKT':^.^HKtr>::;I0^^^*I.I/X/^EYIi^:TtMKF^KHTIlLm 
KEiT tLKU^:K-PKF:-rt.TF:::xrrFr:i c >jvi.HtK t Frcrrrr.TLFLEwjKiKx v cdlfpehe 
< ;k YY t inwKT;:Fti :h:i*N::DY:;Kr:iu.:: i y i kokkcxw ;i' f A/KAvrf KFLNOKF.ionDVEL 
'TV Ih** hvDirnoc icj : pntein 

r:ir/LFKUCY.':LRNKKTK rrv-r : t lALCr r fCET/YDK tR33FV3LHVKFFPKIK0 

AP.';.':mLANL£L£NL'.T.K EHVA3LE£KUC:.rEV3rmTP PLFPE I LTP YFHKLVECKVVYRD 
•rTKWSSSCWyNVCKTI I K P^yL3CNVUT:L*/DYVGEH0SR I RL ITDVCHKPSWAMR 
'-.C rDnVMIKH-ILREL t POVEO C3HAY TLEKDKYEK I SCL,OEl-D3L;0r;EGE>;OAIXRCIL 

CPn_0740 836054 834864 

tyrB-Aromatic AA Aminotransferase 

SYMSFFNHI PTFSPOAI LGLCNVFFADKRPEKVNLVIGVYEHPOKRYGCLSCIRKAOTVI 

GKNnrVPEOTWSNHIRrFSOECLEVIRYPYYSKEQKOUJ-EPLIAFLKEVEKNSVILLHGC 
CHNPTCVDrrEDMWKEJUIU«ERELI PFFOTAYOGFAHC r EU)ftKPIE IFISEENTVLV 
AASSSKNFAL-rtJERVC-rFAVHSTFTDELVXIHSFLEEKrRGEYSSPORMJVEIVSTILSN 
PYUCEEWSEI^FIRESLCKKRTRFVOALRKVACHTFDFLLSOHGFFAYPGFSDKOVLFL 
REOHAVyrrAGGRKNLNCtTEKNIDHWOSFIOAYEL 

CPn_0741 838383 836185 

greA-Transcription Elongation Factor 

EY I FRLKTGC r/CYL£KLOVL I EEGOSANFl^LWEEYCFNDWRGRELVEILEKVKSSSL 

ASLFGKtVtrr/VPLWEKIPEGKDKDRVLQLIUJLOTSNSOMFFDIATEYVNKKYSCEOJF 

KEAUlVVCUUX:RDF0FSLSRFDFU(HMHKGNryrHOCGV<;VGEVMCVSFUX)KVLIErE 

CIMSAKOISrrrAFKSLTPLSCDHFLSRRR;DPDGFEArAKENPIEVVEILLRDLGPKTA 

KEIKDELVDLVr PEAW^WWQSAiaKIKKCTRI rSPDNPKEPYVLSDACCSHMOOL£RK 

LGL5LNSA£K:5UYHFIROIJ{5£LKNIEIRK5LVKALODU}VCEGNK5LILOR£CXL5E 

rLGIia}AS IDKEYITSLSH2I7rSRU.ENMPrVALOKSFLSLVRKYSSn^QOVFM0ILLYT 

TSPTMRDFVYKTIKNDPSSVtVIJCKRU^SAHOP^mFPEIJVW^FUCLG^ 

KE\aJlLFLESALNFMYOVASTPHKELGKKrJiHYLVGORYUVVR^ 

STKCPOFSSSDUWU^SLAEWOPTUOCHKSNVEEeWLWSTSESrSRMKAKtOSLV^ 

MVt3NAKEIEDARSU;DUl£NSSYKFALEKRARIX5EErRVLSEEINRARILTKDLVFTO 

C\raCKVrrJCGDACEVVEYTILGPWDADPDSC:LSLOSKIJ^ONHLCKKl^^ 

ISRI05IWEEHGA 

CPn_0742 838442 838888 

CT635 hypothetical protein 

TKMKVIVMNSKSAOKI IDS :kC ILT I YNI DFDPSFCSSLSSDSDADYEYLITKTOEKIQE 
LOKRAOEILTCTGMSKEOMEVFANNPDhn^SPEEWU^^KVRSSCDEYRKErrENLINEITL 
DLHPTKESKRPKQKLSSTKKNKKKNWIPL 

CPn_0743 838956 840362 

"nqrA-Ubiquinone Oxidoreductase, Alpha" 

IFMKITVNRCLDL^WSPKESGFYNKIDPEFVSIDlJ^FOPLSLmCVECXJDAW 
lAEYKHFPNTYITSHVSCWTAIRRGNKRSLLEVI IKKTPCPTSTEVTYD«?TLSRSDt^ 
EI FKENGLFALIKORPFDI PAI PTOTPRDVFINLADNRPFTPSPEXKLALFSSREBGFYV 
r7VCVRAIAKIJ^GIJU>HIVFRDRLTLPT0EUa'IAHLHTVSGPFPSGSPSIHIHSVAPIT 
NEKEWFTLSFODVLTICHlJlJCGRIUiEaVTAIJ^GTALKSSLRRyviTTKGA^ 
LrroiSI»IOTLISCDPLTGRirKKEEEPFLGFRDKSISVLHNPTKR£tTSriJlIG 
TKTVLSGFFKKKRTYTNPDTNUrcETRPIIDTDIYDKVMPMRIPWPLI^^ 
NELCrLEVCGEDFALPTL:DPSKTEMLTIVKZSI.IEYAKESGILTPHQD 

CPn_0744 841387 840389 

hemB-Porphobilinogen Synthase 

EMSSLTLSRRPRRNRKTAAI RDLLAETHLSPKDLIAPFFVKYGNNIKEEI PSLPCVFRWS 

[^UIJCEIERIX:TYGUUVVMr^PI I PDDLKEAYGSYSSNPKNILCHS IHEIKNAFPHLCL 

ISDrALDPYTTHGHDGIFL^CEVL^mESVRIFCNIATU^AEMCADIVAPSDMMro 

RSIO^OSGYSKTSIMSYSVKYASCLYSPFRCAI^SHVTSGDKKOYCMNPKNVLEAL^ 

LDEEEGADILMVKPAGLYI£lVIYRIR(OTCLPLAAYOVSGEYAMILSAFQQGWLDKETLr 

HESLIAIKRAGADMIIS-rSAPFILELLHOGFEF 

CPn_0745 841903 841742 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VDSCFDI>n<ASSUX;STTVNVAYDPKHTLAYGFCNOVSVKKFHLKPPKSOEKFL 

CPn_0746 841939 843567 

CT632 hypothetical protein 

rSGRCPFSFr/FMLCKEEErrCKCK0CLSHrVTNLT5DVFALKNLPE\A/KGALFSKYSRS 
VLGLJUIXUCEFLSNEEDCDVCDEAYDFETDVCKAADFYORVLDNFGDDSVCEIjGCAHLA 
ME^WS I LAAKVLEDARIGGS PLEICSTRYVYFDOKVRCEYLYYRDP I UfTSAFKDMFLGTC 
DFLFtTTYSALI PQVRAYFEKLYPKDSKTPASAYATSLRAKVLDCI RCLLPAATLTNLCFF 
GNGRFWCNL I HKUXJHNLAELRRLCDESLTELMKVI PSFVSRAEPHHHHHOAMMOYRRAL 

keolkclaeoatfseemssspsvolvycdpdgiykvaagflfpysnrsltdlidyckkmp 
hedlvoiles^vsarenrrhksprglecvefgrdrladfgayrdlorhrtltoerolt^t 

HHGYNFPVELLDTPMEKSYREAMERANETYNErvOEFPEEAOYMVPMAYNIRWFFHVNAR 
ALW ICELRSOPOCHON\*RTI ATGLVREWKFNPM-/ELFFKFVDYSD IDL^ 

CPn_0747 843949 844053 

CT631 hypothetical protein 
RTCMGCKCAEVOI LSSRSLSCMKI LS3SLFYKKFC 

CPn_0748 fl449'*6 844121 

ispA-Geranyl Transtransferase 

OLNiiDVMD:>ALAV£Fvirr:;TL :aodlpcmdnoderpgrptvhkafdeatalla3yali pa 

AYf IHLRLNAKKLKEOCCDPRE I DI AYNI ICDITDKll IOC3CVLGC0YDDMFFSNRGOEHV 
. It^ALLFCl- K.\.\U:LLARC0NNCLELL0RL:;,\OCLWJCCEFETr tCntjC-'F 

''l-li_(l74'» H-l'iOOi. 
• tlrnn iJOr-.:i»;f(A.: lyr (>ph»)f:n(Jory las,- 

MM TKAAHivV, I* .u^vI^:::^:vNU w;vm;AWFRi.Dr:r-ruwr<:rp:;oKr:KK rc/r^RRKLCAF 
i<;Kf ;VA I'i NwiHH\,iiif i.niTFt ifif>^jvr 



tctD/cpxR-HTH Transcriptional Regulatory Protein + Receiver 
Domain 

KITDF r LR I HSYWLFCf><W'IflDKni.HVirE0L3L53qtnOUASQR2CX0TM^ 
ESVAIFCEYUXP EO t f SPT* I FPEEDt tVtFUT?^itEXiTKrj9(^MmLn^mXVL 
OAVlRAFLROHE\XE>iS I PCrrMTPCOHTFRVLNLV I ESPECS^.'YLTPSEAGIUCKiilKR 
GHLCUIKNIXAE : KCrn-KS: : AR>AmVH I A3LRKKLCPYCSK UTIRCVCYLFSCDCS I P 
LONHDNTAHPNEE 

;^:v:'; '.':*..'... '' ' v^gy-^v.! 

MFRCILFC I FU.T;:F5oOGVLV Y'ur C JHDK J- IGPKEKSRSVW I EEEKEFTDSVLHHLPSQ 

HOHWIlXFOGFI^KOCKFSOAEKrFSia-YSEAODCPFLFKEEILGSRLINSFrLEICrD 

VMCTII^LLNORCPNS PYYHLFKALVCYKOKLYRE^'I E0l-AYW3E£mAlAPLLNISl E 

OLLTDFtiOYISAHSLIEGKMFPECRVILNRNlrmLUCHECEWJAKTYORIAILLSRSVF 

U2*VESKSADIYFDYYEMVLnfLKXIYILE0CPYA£LLPEEELVSLIM£HVFILPKDKLY 

PLIOtiElfWKKYVHPNSSLWO ILV'DRFSTHMEGAI RFCEALVSFSGLEELHOOt mr 

EELLSNXVQOimEAKOC/ALLH I LOPS IS :SEKIJ^3D7U:nI VSCCTEOHTiaiW 

LDLWEAIOSYDIDRCOLVHHLVVGAKDLWKKGGNrEKAlNLLCLVwRTTSYDIECESW 

tJ'IKOAYKOAI^SHAIARIiKLEKFISEANIPSIVISEAEKANFIJUlAEYlJAHEDYDirc 

YLYSKWLTKVAPSPOSYRI^I^LMENKRYOEALEFLCMlSPrrosrNTJYKIOKAI^^ 

HOSKDRAAS 

CPn_0752 848595 850082 

"recD-Exodeoxyribonuclease V, Alpha" 

GWALHTEFAPFLEDLVHOQVISPLDIAFASKHISSDFEESFVFLAVSSAtWRYCHPFLSL 
EENRIRPSLOGISmLYRGFHNLPKEARDKLFVWSCRLVLRSLYTIRSKLLOKLSLLC 
SATPWFPPS^DSSII^EEO^^rFNKITOa:FSrVSGGPGTCKTFLAAO:.rLSLVKOOPK 
LRIAIVSPTGKATSHI RO I LMJCYNIFDDMVLMQTVHHFLOEYAYRRVNS IDVtXVDBCSM 
VTFCrXYSL^W>0GYEKD}aaYTSSLrILGIm^0LPPIG:G^*T2^PLCDLICY^HEl^TF 
LKTSHRAKTGVVIXJLTOSVIJWE«ISFSPLPSrSSAIE\aJCNRFVKSLROSEAR^^ 
MRIKPWGVLNUmiIHORLARSDPDLRIPIMVTSRYrrv«LFT«aixn^ 
HEPrOSRALSOniVYNYVMSVHKSOGSEnfDEVIVI I PKGSEVFCVS ILYTAITRAKYl^^ 
WGDPETLHKI ZtCKSNY 

CPn_0753 851009 850161 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IMXTAHLCROAUJiLilSWTPAIRASGNLFRWSMSUlNhA/U'ACOrVXU 
USSSHYAHAALQKTSGFLGAAIXJVrn'AVAGAHIJjQJLLNGSMI Ft. ' l UU .TU LL RRCNEAD 
ABCQfrOKLORftS ALT ITGKVARLAS KTLGTATFLH DffiWS LGANANX lOCICVrSCLNL 
VATGCSLTESSISLYRILSTRPETISDPENRNKPSAEFAARSKAIRNAFIAWLCOWOLV 
CDALCTLSLFLPAILGVHAVLIMAILGLISCVINFVKDYAKIG 

CPn_0754 851381 851040 

rs20-S20 Ribosomal Protein 

OFIUiLKVLVI^GDIKAPKKPNKKNVIORRPSAEKR I LTAOKRELINHSnCSKVKTIV^ 
FEASUCLDDTOATLSNLOSWSWDKAVKRG IFKDNKAARIKSKATU^^ 

CPn_0755 851579 852799 

CT616 hypothetical protein 

YmJ-FMlXVRKWUfrCFXYWt YFLPVVTLLLPLVCYPFLS I SQKI YCYFVFTTISSLCW 
FFAIJlRft£NOUaAA\^tIOTCIRKLTDJNBGUlQIRESU^ 

LFKLOCLLVKTTCGECQKIXTLLLHHTEDmCLKMOVDSL I OECCEKTEEVQTLNRCtACT 
LAYOOALNDEYOATFSEORNMLDKRO lYIGKLENKVODLKYE IRNLLOLESDIADIIPSO 
ESNAVraJISLOI^SELKKIAFKAENIEAASSLTASRYUmTTSVHNYSLSCROLTI^^ 
EENLOflJVYAROSORAVFAUALFKTVTCYCAEDFLKnSSDIVrSGCKO*^^ 
CSGRLVIKTKSRGHLPFRYCLMAI/nCCPLCYHVLCVLYPLHKEVLOS 

CPn_0756 852889 854676 

rpoD-RNA Polymerase Sigma-66 

ISYLPLTKI^SKARNPLVLFOVRKUWmjNSOATEVSSEEESOKKLEELVAUUCBOC^ 
TYEEINEILPMSFOrPE0IDaVLIFLTGMDI0\n:N0 1 CJ\^OKEKKKEAKELBGlJU^ 
GTPDDPVRMYLKE>CTVPU.TREEEVEISKRIE3CAOV0rERI ILRFRYSAKEAISIAHYL 
rsaCERFDKIISEXEVEDKTHFLKIiPKLITLUCEEDTYLENm^UCOPOt^KOEAAia 
rmSLEKCRI RT0AYU^CFHCRH^^VTED^CEV^^KAYDSFLHLE(X5I^roUCVIUERNK^^ 
AKIJWOCRKLYKREVAAGRTIXEFKKDVR>njORWMDKSOEAKKaWESH^ lAKICY 
T^^«H^FLDLIOBC^IMGUIKAVEKFEYRRCYKFSTYATVAtfIROAV^RAIATOAR^IRIPV 
HMICTItnCVU«:AKKUmETX:KEPTPEElAEELCLTPDR\mE I YK lAOH PISLOAE^ 
SESSrCDFLEOTAVESPAEATGYSMUCDKMKENajaLTDRERrVLIHRFCLLDCKPJCrLE 
EVGSArN\n'RERIROI£AKALRKMRHPIRSKOLRAFCDLLEEEKTGTSKVKSLKSK 

CPn_0757 854709 855134 

folX-Dihydroneopterin Aldolase 
PCIKNIALVIAIERYOLIISKFRMWLFLCCSVEERHFKOPVLISVTFSYNEVPSACLSOK 
USDACCYLEVTSL I EE lATmCPYALI EHLANELFDSLVI SFCDKASKIDLEVEKERPPVP 
NLLNPIKFTISKELCPSPVLSA 

CPn_0758 855104 856459 

folP/dhpS-Dihydropteroate Synthase 

RAMSEPRFVCLSLCSNLGNRFKNLOIARTLLCEOAVLGLRSSVILETEAUXPGSPPEWD 
LPYFNSVLVCETTLCLRELLVT r KOI EKWCRAEES PPWS PRT IDVDI LLYGOESFCCOH 
TElTIPLSNLLSRPFLIALIASLCPYRRFCTOCSPYHNrTPCELAHHLPSPPGMIRRSLS 
POTMLMGWNVTNDCMSDCCMFUJPEKAVAOAEKLFTECAAV I DFG AOATNPKVKQFLSV 
OOEWERLEPVLRLUCETMSNRKOYPI ISLDTFYPEI ZLRAMDIYPIOWINDVSQGSOSMA 
EVARDCELSLV^WHS33LPV^)PKNI LSF3VP rCEOLLSWGEKOLKMFSDVCLNANOVI FO 
PC rCFCKGAAOSLATLYEIAKFKRtjCCP I L rCHSRKSFLSLFGNHDPKDROWETVCLS IL 
LOOOCVDYLRVHNVAAHOKALSVAACEACAP I 

CPn_0759 ^S64'»4 HS#i9'»7 

folA-Dihydrofolate Reductase 

LLVKPVItPCNFENPLrjVEftOKflP/r/P''; r VACDPnrry rr XECKLrVfl r.TEDLorF.';ETroK 
FP I VMGRKTWET[.pPKyFVTRAr/vF.':r(RKRo :vh'';e I wvT!;Lr.EFLU.i>u*:::pTFLr(X 
'^ely:;lflemo (vp.dff i r-u i kkcm' ;t/rFFrt^:LLcnwrKTV( J^^>^'K rTT(.-YYOiitH.i 

t;WiJ>7f-a ;.%7i.i.l 
^.T^ll hypfirn*;r. I'Mi iwr-iu 

RKCPKi^'LE [ PKP:;f;rf \rrMK tTT/KTr-K f yrvuDLY;; i i.E:;r:r,rKt.NKr?" i w trsK tvr* 
r;;MyvWELEKy:;KLrEL.i KOfc:*\LAYvr/f :kY' : i yltkkwt: i l r r::/\i : t ^J^•l•;Mvrt:Y^VLY 
I'hLFLLrnflrru ;Dwr J<NF\ iit.F.ii'/ : i r r ::\/: .uTrH.URt rmr.u :u w .ynyvi :k p 
DChT;RALKKrY:;Ni.Lix;u:AAA7t/ ■M';ir:rta'«'rciA [ I eeapk tTnr:;:;m*U'i)M:rru\ 

IAH(EDLYf;PLLOr;MAWETrAIT:: 
.TTf^lO nvL^'rn'iC ic.ii prnr.«»tn 

1 1 HT.TW r ELLDKO I EDOHMLKHEFYQRWSECKLEKCOWAYAKDYYLH I KAFPCYLSALH 
ARCDOU? r RRU I LENLMDEHJkCNPNH r DLWROFAI^LCVSEEEIJWHEFSOAAOOMVATr 
R RU:OMrCLAV:j LCALmZ 10 1 FCVC/EKI RaUCEYFGVSARGYAYFTVHOEADI KHAS 
££KEML(7rL%r:Rn:NPDAVUXC0EVl.DrrLWNFL3SFINSTEPCSCK 



E;SLLYFSRFSEKO^^f GLCNHETKRSimaLPDRKKALZAAVAy I EKOrcACj IKSLCRHS 
ATHEIST tKTGALSUSLAJ;^ IKCT^KGRVIEIFGPESSGKTTLATHIVANAOKMOCVAAY 
; OAEHALOPSYASLICVN I DDLM I SQPDCCEDAI^IAELLAJtSGAVDVrvrDSV AALVP K 
3ELECDICCVKVGU5AW«S0AUUaTATLSRS(7rCAVriN01R£KIC7/SrCNPrrTTGG 
RALKrYSSrRLDIRRICSIKCSDNSDrcrWIKVKVWaOCrjVPPFRrAETDIUl/EIS 
"CILDLAVEYNI lEKKCSWrNYQEKKLCWRn^/REXUavnUOJEEIEXRIYDVIAANK 
TPSVKANETPQEVPAOTVEA 

CPn_0763 860520 859972 

ygfA-Formyltetrahydrofolate Cycloligase 

ftFPKrDPKrEKSALKKLFISIRRDLSEERKHEASSAVASFVRSFSfCESNArLSrVSFNKEI 
DMOEANR IL rCKCTLALPKIDOEIJLYPVLIPS IDOL ISWHPKDPFSKOTPISSDKITHV 
LVPGLAFDOQCYRLCYGHGFYDRWLAOHPYPS IRTIC ICYCEOK lORLPOESHDIPLSQI 
YLC 

CPn_0764 861819 860524 

CT648 hypothetical protein 

CYKSMDIKKLFCLFLCSSLIAMSPIYGICrCOYEKLTtTGINriDRNGLSETICSKEKLKK 

VTKVDFtAPQPYQIO/MRMYKNKRCDNVSCLTAYHTNCOIKOyiXCLNNRAYG^ 

GNIKIQAEVICG lADLHPSAESCV^lXTrTFAYrroEG ILEAArVYEKCU^SSVYYKTN 

GNIvnCECPYHKCVPOGKFLTYTSSGKLIJCEONYQOGKRHGl^IRYSEDSEEDV^^ 

EGRLUCAEYlJJPOTHEIYATIHBGNGIOAIYGKYAVIETRAFYRGEPYCKVTRITOSCrrQ 

IVaTYinXOC»KHGEErFFYPETCKPKUiW«EGII^IVK^^ 

KSGLLTIYYPEGQIMATEEYDNDLLIKCEYFRPCDRHPYSKIORGCCTAVFFSSACrrTK 

KIPYQDGKPLLN 

CPn_0765 862415 861801 

CT647 hypothetical protein 

rrmKI^RUfKKWI SILII^Fl^IXSI LPVtAITINHVKI SORWSDLNSOILTL^ 
DHEDOVIXHNARISKDRNNLSIESIJiASCKOLRPl^KERERIJIiaNSNSLLAOSKEWER 
:<RAIJXSNHOLVV^E0MHNDFArVRIX0ATE>tr»IEDIESIJ'SlJWEWPVAPLV^ 
KMTKQTTPLGNEVWLTHAEAISRWI 

CPn_0766 863785 862394 

CT646 hypothetical protein 

AMNHCLPVYHIGLTKAENrrriKIAILQKraCGWIVCHCEOIPEGKTWSLPKKyrAAP'^ 

SLQGSDILVKSSSSSLKNRKNILKVALTIOXASLALPWESLIVOPOLGKPTDRGETPL^ 

WIAOrarrUOCEl^Fl^OAOIFPDKLSCRAADI FFtAEOSPLKSLPAYLLIYCGSEEVTC I 

rVKNHXIAVARSFSNHSTKKSCDDrHATLOYIQETFPQTVLPAIHVAOISPNLOKILBQK 

LSLPLVVCQSKTYGVEDEDWEIYCEyriAAAHHGASRRPLTFPYDATSVSPAAOKHWU^ 

SU.ICKYALMATVWSLCSVU^JCSI^SSAS^mFA^ACPEEX;VLPRSIJCAAEK^^^ 

KNSAStTfPLUT'IPTSEQTUCFUJ^KSSPSIKFSYFSYTtfrSYPSKDNPSUTSAL^ 

WOGQPEDIPOFLKKISSHPKLOHVSESLEDORSFKLOFTLSS 

CPn_0767 863878 864177 

CT645 hypothetical protein 

NIMLSYLtJlTArNVYSFLILAYIFASWVPDCOSARWYOLVSKCVDPFLNFFRRFVPRICF 
IDPSPFVCLLCLGILPFVILRVUIFIILNIFHSPWLLOYL 

CPn_0768 864144 865163 

yohI/nifR3-predicted oxidoreductase 

YFSFSKAAPIFIKNILUlSSIVYAPLACFSDYPYRCMSALYOPGLMrCEMVKVEXIILyAP 
ERTSKUJDYNENMRPIGAOIXGSNPETSGEAAKILECLCrDLIDUJCGCPTDKmDCSC 
SGUJaPELIGRIU^KI INSVS I PVTVKIRSGWWEHItA/EDTWI IRDACASAVFW 
TRACX^fHGPSKOEYISRAKAAACKEFPVFG^COIFSPEAAQAMLTTa:lX;VtVARGTLC^ 
PWICKO rOOYLTTCSYEKI PFI KRKAAFLEHMRLVEDYYOSETKFLSETRKLCGHyLISA 
AKVRFLRSSLAKATS'^OEVYOLVNDYEEADDSSLETFVKC 

CPn_0769 867763 865121 

topA-DNA Topoisomerase I-Fused to SWI Domain 

3 ICX3PHA r R1>1KKSLI IVES PAK IKTLOKLLCSEFVFASS IGH IVDLPAKEFC lOVDHDF 
EPOYOVLPDKOEVINH IRKLAAKCEKVYLSPDPOREGEAIAWH I ANOLPDSPLIQRVSFN 
AITKNAVTEALKHPRTIDMALVNAOOARRLLDRIVCnCISPI LSRIOOORSG ISAGRVOS 
VAUCLWDREKAIDAFVPVEYWNUlVl^OPmKTFWAHLYAVCGKKWEKEIPECKTEN 
DVLL INSEEKARHYAELLEKSSYTITRVEAKAKRRFAPPPF ITSTLOOEASRHFRFSASR 
TM3 1 ACyTLYECVDLDSEDSTCLITYMRTOSVRVDPEALTTVR EY lOQTFGKEYLPEKANV 
YT TKKM TQDAHEAIRPTDINLTPDKUCNKLSDDQFKVYNLrWKRFVASOITPAIYPTIAV 
OITTCTEI DLRASGSLUCFKCFLAVYEEKODDENDOEEOHPLPPLHAODALI KEEVSOEO 
AFTKPLPRFTEASLVKELEKSGIGRPSTYATrMNKIOSREYTTKENORLRPTELGKl ISO 
r LCTr;FPR I MD ICFTALMEOELELI ADNKKPWKLLLOEFWTTFLPWITAEKEAVI PR I L 
TNIECSKCHKCKLVKIWSKNSYFYGCSEYPECDYRTSEEELAFNKEDYAECrrPWDSPCPL 
CGGVMKVRHCRYGTFLGCEKYP ECRCTIS I HKKGEE I EOEEP r PCPA IGCNCKIFKKRSR 
YMK r r/'SCS EYPECSVICNS I OAVrTKYSCTEKI PYKKKTPTKKKSSAKTTKAAKTPSKK 
CKAKSSVKKSS EKKTG PLFLPSPDLAKMICNEPVSRGEATKK rWDY I KEHQLOAPENKKL 
L'/PDNNLAT I ICPNPIDMF0L3KHL30HLTKV3NDESSASS 

CPn_0770 !?68322 Sti^»13l 

CT642 hypothetical protein 

K PftTRNVEKLEFVTSLCS PDDDL ITFNKOC tl ACPEEEKVAFLVRSNAMLDACPETPASF 
^E:;LnE0FO I FPEYVEVLY JNECLDVWEAGfrrW I lEVT rOLRKHHRKASRWLCKYSRD 
tVt /J rEA-/I lAVBHKFH ERVFEEVt^WCTTSRWrTWRRFFO PCFR3 PCESVULLFFT r LCLC r 
::t.>f/PAC II. tMUVLI -MYFtWRLCMAOSYLVRAMKK IPKMUGVPPLWVLLRLTOKEIKMFA 
y Kf • t PVLEl (YAPK RKLF>JVr<WKO t YO^V FV 

it,jyrn k/jimi <i.ti4i 

rpoN-RNA Polymerase Sigma-54 

t KK'tHriKP l.YU :: :ALOMtV«:>KfjKU;i.KY LPr;Ll*>JO(/:LOM WGPLTEUJ Sr/-/OE 1 1 ON P 

KKM^*;:;c j:Li:r>f: :r< Yp rpN: TFriY L^«^•^< ;rvEi; LYTR LLP^ 

-V ;f ir.*:i)ry JLIH-l'MI-KDrAOELELPLEK r MKVWr^tOf ILCr-EG t ACPSWS'^KU-RNSS 
I M.V»Y: W VPnCYPtKTr* ;ErAP [MKKF^'L::u:i:UttI tUCKAUIC r PWCPAAACTVKPMVS 

( I f .M . I (tyr.irj wk r icv:rrnGi.r;; iKi.nKFTn(FYaiLPKEE0KNL500i lsakwlik 



NLRKREOrtXOVMETLLPKC -U:KtPAPYrLA::KCLAECCJFnE.Tr:rPAI£NKAVA 
AP tC r FPLKHLFPRC I H002.':HSKE^^VLCW I RCW r ATECTFL3C.r/ t3DR :TAKG r PTAR 
RTVAKYRAOU(aPAWPraLF-f.tfC|[hCiRFRDR^^ 

CPn_0772 «i72400 e'0A6^ 

uvrD-DNA Helicase 

KLCLXHTC I3EU/EACRKAVTAPC^PVLVt-\CACACKTRV\TYR : LHLZNCX: lAPREILA 
VrmnCAARELKEP :^"N0CA5TNEFOVPMVCTFHSl^FILRRS imj^iREW 

;KA:;A:.pr:>: r. ^:J-:•:/:■K.^::;:.v::. : vvv ■r^,,,;\^Y-.v^^:^^ 
^A^TA\^\;DPCOSX VJWft<;AN IHN lUfFENOY^NAXVUJLEEN-rhi'YCN ILNAANALIKNNA 
SRUXEUISVXC PCEK I RLFU:STOR EEADFVAAEI LOUiRVXSIIKLRDICIFYRTNSOS 
RTFEDAUJIRR r PYE: XGCt^FYKRKEIOOI LAFLRIFtSKSOIVAFDRTVNLPKRGICS 
TTIFALWAIAOCLPIUCACCX5AimKDVKI^KKOOECU:EYlJU.rP0IEHAYfCLSLR 
DFIESVVRITCYUXUCEDADTFKDRKSWLEELYHKALESEOONPKTHLELnJDU^ 
SDDDLNLTADRVrrtXryOWKGLEFRVSFLVCLEEOliPHANSLOCTYENIEEERRL^ 
CITRAODU.YLTAAOVRSLWGTVR«MKPSRFU<EIPKOYMIQfVR 

CPn_0773 872485 873195 

ung-Uracil DNA Glycosylase 

FM0NATID0LPVSK;ECt.PLC>reE0EJCEEWSKPYM00U-IFLK0EYKEHrWPEOI^ 
ALRSTPFDOVRWIUA:OPYPGKGOAHCt^FSVPE)CORLPPSLINIFRELKTDtJCIENHK 
0ClJ0SWA^O;IUX^m/LTVRAGEPFSHAGKGWEL^TDAIVTKLI0ERTHrIFVUCAAA 
RKKCEUJTiSKHOHAVt^SPHPSPtAAHRCFFCCSHFSKIWLLNiaNKPMINWKLP 

CPn_0774 873183 873425 

CT606.1 hypothetical protein 

LEAPKNBC IHSVCPOKTPRLTAKSWSMEMIXTTQOLPS AECMPSVANLEAOFLRAEALL 
AEMREIRGCLCOSLRTLVPSC 

CPn_0775 874040 873414 

yggV family 

ERniKIVIASSHCYKI RETKTFlJCRLmFDIFSI^DFPDYKLPOEQCM ITAiWLTKC 
AANHLCCWIADOTMIJWPAUWLPGPL^ANFACWGAYDKDHRKKL^^ 
AYraCCVVLVSPWEIFinTfCICESnflSHOEKCSSCrCYDPin^KYDYKOTrAELSEI^ 
NQVSHRAKALQICLAPHLOSLFEKHLLTHD 

CPn_0776 874180 875487 

CT605 hypothetical protein 

FIFVOJCNFYDCLLMFFOFI^rTMKKI FYSFVU^C IFPYVGCACTVrV^ 

CIOGiaCIALISHSAAINSRGOnALS\^SRKHDCTVEII^LEHGYYGATPTETVCNOPS 

RYPtILRSVSLYCVKE\mC£VAEHCDVr\mM)DICVRSYSFVTVli!Or^ 

VUmPNPHSCRIVrx;PU»NPTrSCSIAIPYCYG^frPGELALFFKK^YAPNA^Aft^ 

WNRSKrrDErCLIWMPTSPOMPDPOSPFFYAATGILGAt^AStCWrrLPFIC^^ 

DCBCVADEI^IRKiaJCVtJLPFFYEPFrGKYKMEMCSCVLLV^ 

VUCALYPKQVEQTUCSIERIPARRSStCNLFCCOETLSISKKERYIWP 

FHOLRSSCLLSEYAES 

CPn_0777 875586 877178 

groEL_2-heat shock protein-60 

TSEDKSAWFXSOFBGLSALKRCWALTKAVTPAFGPRGVNWIKK^ 

AKSIILQQAFESUJVKIJJCEAIXKWEQTCOGSTTALWIOAIXT^^ 

KAGIU.SVEim(X)L0RQAIELOSPKDVrtJ<VAMVAANHDmXnWAW 

SKDSCISKTTlGUaCRVKSCYl^PYrVTRPETMDVVWEEALV^ 

LISEOfmfPLVIIA£DFCONVIJm.ILrOCUlNGLPVCAVKAPGSREIJ^^ 

ATLIC0ESENCE1PVSU)VI£RVKQVMITKETFTFLECCCDAEII0ARK0ELCLAIARST 

SESBCOELEERlAIFICSrPQV0ITADTOTEQRERO^OLE3AUUT10U^MKCGIV^^ 

AFXJUUWAIEVPANl^SGWrFCFETLLOAVRTPUCVLAONCCRSSEEVIHTILSHENPRF 

GYNCKTCnTEDLVnAC ICDPL IVTTSSUCCAVSVSCUiTSSFFISSRTKT 

CPn_0778 877400 878092 

tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase 
APVAOSDRVPGYEPGGORFESSLVRNNKRVEEEWMTLSLVCKEAPDFVAOAVVNOETCT 
N^LWTYUIKYVVLFrYPKDFTYVCPTELHAFOnALGEFHTRCAEVIGCSVDOIATHOQWL 
ATKKKOGGI EG ITYPLI^DEDKVISRSYHVlJ<PEEEI^FRGVTLIDICOC 1 1 RHLVVNDLP 
LGRS IEEEIJm.nALI FFETTJGLVCPANWHBGERAMAPNEECLON^FCTID 

CPn_0779 878502 878095 

CT602 hypothetical protein 

RFDLrFOMKrrVALFGEAEKCSYITrAYFCRSLVDLHNYLCDVSSPClTLAIIcaLSDYNV 
VYFRVREEGYCVDSYFFCLHFLNTQTTLKNI lAICLPCVGNQHI lEASRSLCQKHNSLLL 
FFOHDLYDLLTFNOPF 

CPn_0780 879241 878591 

papQ/amiB-N-Acetylmuramoyl-L-Ala Amidase 

HCNK t AVOSLRFMHAKLSFF ILLSLLFSG IDCSPiHAACRSPSLOCVUVEI EDISAKLAS 
HE\rt:iVMUIERLDEODSKCOKWTAAK PCTl^KI RELESCOKALAKTWVLTTSVKDLQT 
NLOSKLOE lOKDHRALAODLRLVRRSLLALVDSESPGAY AOFSDPVPEN I Y I VRBGDSLS 
KIAKKYKLSVTELKKtNKLDSOAIYACORLCLOP-NKQ 

CPn_0781 879851 87919?

pal-Peptidoglycan-Associated Lipoprotein

OrO'RSRRKTVPLIX^CFPSATDKENTWI I HSt>nXCTLLALLALPACSL5PW 

TCHHTRRKKPSSFCFVPl.YTEEOFNPNFTFCr/DSKEEKOYKSSOVAAFRNITFATDSYT 

t KCEOIUVI LTNLVHYMKKNPKATLY lECHTOERGAASYriLALGARRANAI KEHLRKQC t 

SADRLSTISYCKEHPLNSCHNELAWOONRRTEFK.IHAR 

CPn_0782 881077 879771

rt^lZ prjiytuicchu iile transporcer 

» :r) r'iMLROU'FOVFFFvTArlLVYAEELry/VR'EH ITLP E EV:;rOTDTKDrK tQtrfUZr.L 
TE 1 P'/KD I AlA:Dt:i^?rTAA::KESS*;Pt^ISLRLHVP0L3r/LX^r;r:KTPCTtJt:::hT ir-ON 

L::vDi'Vr I HU/v\rTp\i tVALTG [ PC 1 3 Ar;K I vFALSSLGr coKUcynKLwrrPYu :kni 
t;rTir'r:i iTrKwv.:v\::*^FrYLYV3YK-WPK iFLCSLaiTBnKKVLrucc :NorniTKS 
ni< KKI. I JVhVAi rrvi :Nri 'i.f i oi'f::i.t.t;pmcr PPntAMEitFi7TQrMP::FNrFiV.uji,yrtr> 
NKi/;i*i>i.Y I m::i .im'ki vAfm.LTKKYr'N:;scPAW.';pDr;KK iAr(:::v rw wtmr i -mu: 
:u :i-:i#Yoi.TT::iTNKr::i : WA I n::nnr.VF::AiWA£ESELYLt 3LVTKKTNK I A t( ;vf ;kkrk 
r*::wf;AKrv.MMKi<Ti. 

i.T'i'*': (tyt-tfhi'r titnr.>tii 

I MMr/Li 'Y t A i-PA* • i 1 1. : u .uuvi 'A; : ri .f'KKff t//pKAFOEKtvT ror-KPi VI 'ri •: :vvvi jp 
AKTtRPrrv*AT0P0K0AKC3PP0ErJVQKAU)KP [PK ..TEPPKPSPAPTVAKKTTATEKP
PP5rnKKrn^t-^KTQUrLJEVAOAL3Un/OK ra<3CTjLKN ISWPSTAOLTMHSEL^ 
'JEDELCELFRTI I r ALPSKr;*m r KLVLJ PNCE r0EC3FL5EVSAADK0LLTCR lOALPFO 
KFLEKYKV.'^KN r.:FH I KCA-ZNEC 

CPn_07n4 ^8235') 9^1802 

exbD-Biopolymer Transport Protein

nRAnr:TF--Fr.'Pr.OrPOYKt>*fr,^PrrEEIEE£?L^'NLTPL:D['/T^/tLKAFlVAVPLrK 

...\':A:..\i ■ v:..:.:;;::l.: :av; :";;-v,i.rr ; :-:,^:;:.r/":.T:,:.:tr*Av: 

: rLi.wii;;v i'r-'.riVKMAii-Vv :i if:'U(VA;.-,r: 

CPn_0785 88303? 882296

exbB/tolQ-polysaccharide transporter

DHLYFETTLSVNKOFYSMVH FSHNP I IQAYTEADFFCKSIFrCL:*IUSVavn/Ii<OKrJVI 
OKNFUCACKSIJCDFLIK^mHAPLSLDIHPELSP^ADLYFTIKi^CTL£LLDKNROSAPDRC 
P I LSSEDIQSLETLLC AIMPKYKALLHKNSFI PATT ISLAPFLCLLCTWC I LVAFTHIS 
SCSSCWSAIKEGIATAIOTIICLFVAIPSLIAFWUCAHSSELISEIEQTAYLLLNSIE 

VKYRNTNL 

CPn_0786 883137 885293

dsbD/xprA-Thiol:disulfide Interchange Protein

NHGVIU^FmLOTALIAPFFSFPALSGSFSS lOAEEITOOVNH PGAEU^ECSYIPGL 

QTFRLCIKITASKGSHIVWKNPCEIGSPLKISWOLPKGFWEEEHWPTPKVFEEEGTTFF 

CYEDSALIVADVRAPEGYTPGQEVElJLAQVEWLACCDSCLPC>n/DLKLTLPYEEKEPSL^ 

PDTKAEFTKTUiAOPRVLENDHSVaVAOGKGNEIILNISKKINATKAVffVSEX^ 

AETSYSCGTCTAWRLKVKNLSGVOKNEaO^GIUIADHTGRPVESLTIHSEVl/^^ 

ACLSQYITILIMAFLCGVtXNIHPCVLPLVTLKVYGLIKSAGEHRSSVIANGLWFTU^ 

GCFVCLAGVAFIUCVLGHNrGWGFOLQEPMFVATLI XVFFLFALSSLGLrEMGTOFANU; 

GKLOSSEMKSSNNKAVGAFFNGrLATLVrrPCTGPFLCSVLGLVMSLSFLOOLLIFTAIG 

aWASPYLVFS\^PKMLSVLPKPGGWMSTFKOLTGFMrXVTVTWLVWIFCSETSTTSV^ 

UXX5lJWLA5LCy^WIU:RVCTPVSPKK0RVCASLIJ-FAFU^ 

SVNEDSUJQPFSLEKlJWUlAOGRPVFVNFTAKVCLTCQKrWPVL 

TLEArwniKDPCIT£EIJUU/:RASVPSVVyYPGrWSAPWLPEXITONU£^ 

CPn_0787 885604 886401

yabD/ycfH-PHP super family (urease/pyrimidinase) hydrolase 
. TRROPVDLAOAHVHLSDOAFEEDINSVLQRAODSGVSLVVNVTn'EKELNRSFAyAERFP 
KIRrCHVGGTPPODVDQDIE£DYHNFHAAAHSKKIAArGS\A;LDYCFATEBGIARQK^ 
QRYIJ^LECELPLVVHCRGAFiroFniMUX3Ylf»JDPRSRPGMLHCFTCTt£EAQE2.IS 
GWriSrSGrVTFKNA0DUU3LVVHXPLEHIXIETnAPFlAWPyKGKXNEPAHV^ 
VANVKCMFPOELAALAYKNVLRFLHG 

CPn_0788 886521 887432

sdhC-Succinate Dehydrogenase 

SLVKSLRHSRHEICPEVSHKKGKYYSTFIFRCIHSLAGIAFTFFLCEHLFTNMLASSYFS 
QGKCFVA^W^CFHKrPGUCIIEVAGLVLPFIrKAIIGmI^QCKSNCYSGDGSRPKU^Y 
AKNYSYTWRV/rAWrLLTGIAFHVVHUlFIRYPVHNmiHGTTYYAVDIOPSRYDVIV^ 
KGFLTLWLPNTEASSIEVSRHDLGGADAAU-SERNSYIXTPSAGTAFLVVVRnALGSl^ 
• AIXYTILVlAAAFHGFNGLWTFCCRWCWWSUlMQGVUlIVCnAMIWrFt^ 
YSVX 

CPn_0789 887436 889316

sdhA-Succinate Dehydrogenase

QMDaiRlCVIWCGCLACLSAAMOLANLCI IVELVSLTKVKRSHSVCAOGGINAALNLKPE 

EEDSPyVHAYirriKCGDFIADQPPVLEJCLAAPRIIKKLDNFGCPFNRGPSGNLDV^^ 

CTLYHRTVFCGASTGQOLMYTl^EOVRRREHAGRVIKRE>mEFVRLVTDHSC 

?n>FNNRIXIIitGDAVIIATGGPGVIFKMSTNST F CT O AAWGRLFLQGMAYANPEFIQIH^ 

TAIPCRDKUlLISESVRGEaiRVWVPGOSSKRrVFPDGSERPCGETCAPWFLECMYPAY 

G^rt,VSRDVGARAILRVCEAGLGIDCRMEAYIXVTHLPEICTRHKI£VVL^ 

T^mmiFPAVKYSMGGAWVlWAADDPDRDSRFROMTNr PGCFNCGESDFOYHGANRLCA 

NSLLSCLFAGLVSGDEASRF I EAFGASOATSSDFDRALOOEKEENARLLSASGKEN I FVL 

HEEIAKIMVRN\mnCRNNRDU3ETKDKUCErRERIJCWSVLDSSPFANKSFHFVR0 

LELAIAITKGAUJlNEFRGSHYKPEFPERDDEHWLKTIVAWAPEEPEISYLPViyrRW 

PTLRDrncSSTCKI ELTWIPDNIRLPI 

CPn_0790 889279 890103

sdhB-Succinate Dehydrogenase

MSRIFL I ISVYPYRKR E^mEI^LCT^ILKI YRGVPGKCJYWESFELPLHPCENV I SALMEIE 
KRPVNIlX;EKVNPVWEOGCLEDrcGSCSILV^C^^ROACTALIOEYIDATOSREIVL-AP 
LTKFPLIRDLIVDRS IMFDNLERIOGWVAADI ECETFGPOVTOEOOELLYALSOCWTCGC 
CTEACPO IDNKSDFIGPAAISOARYFNTYPGDKRSKKRWRALMCKCG I EGCGOAHNCVRV 
C PKKLPLTES ISAVGREISKFSLRSLFSALFKiCKK 

CPn_0791 893104 890111

CT590 hypothetical protein 

TCLRSSRKI WEDISDRNMYSCYSKG ISHNYLtHPMSRLDI FVFDSLXANODONLLEEI F 
CSEITn^KAYRrrAUJSPLAAKNLNIARKVANY ILADNGEI DTVKLVEAIHHLSOCTY P 
ijGPHRHNEAODREHU-KMLKALKENPKLKES I KTLPypSYST lONLIRHTLALNPOTILS 
TIHVROAALTALFTYLRCOVCSCFATAPAILIHOErYPERFLKDLNOLISSCKLSRIVNOR 
E I AVP UJL^GC IGELFKPLRILDLYPDPLVKLSSSPGLKKAFSAAKLIETU3DSEAOI0Q 
LXSHOYtWKWNVHETLTANO I IKSTLLHYYQLQESTVRA I FFKEGLFSKEQVAFSTOH 
PR ELSEXORVYHYLHAYEEAKSAF IHDTONPLLKAWEYT WTLADASOPT rSNH I RLALC 
WKCEDPH3LVSLVTHFVEEEVENIR I LVQOCECyrYHEARSOLEY lECRMRNPLNNODSO I 
LTMDHMRFP.OEUJKALYEWDSAOEKAKKFLHLPEFLLSFYTKO IPLYFRSSYDAF rOEFA 
HLYANAPACFRI LFTHCRTHPNTWSP I YS INEFIRFLSEFFTSTES ELLCKHAV INCEKE 
raRLVWH ITAMLJrTDVFOEALLTR t LEAYOLPVPPS r LNHLOOLSOTPWVYVSCCTVDTL 
LLDVFEnCEPLTLTEKHPENPHELAAFYADALKDLFTC I KSYLEECSH3 LLSSSPTHVFS 
I : ACSPLFP, EAWDNDWYSVTWLRDVWVK0H0DFU3CT ILPQLS lYAF I ENFCNKYALOHV 
VUDFHDFCSDHSLTLPELYOKGSRFLSSLFTKOKTVALI Y [ RRLLYLMVREVPYVSEOOU 
i'r/t.DNV.'::;YLGI.'J::RITYEKrRSLI EET IPKHTLL^GADLRH I ykollmosyok iytee 
l/TYLPX'n'AMnMI [NL/\Y ^APLLFADf;^MPl; I YFCF r UIPGTTEI DLWKFrr^ ACUCOPLO 
ri I OELFA-rarWTI .YANP I DYf WP PPPCYR5RLPKEFF 

•:i7i_t)7';:: s^^-.tss tj>JiOH 

'rTVH't hvfrfu htjr, protein 

f-rHHI. lU I K': r n IHKltTFTKnVLFFFFLV t P t PLLLI (Ii1VVr:FF::|-::AAKANLV0VUrrRA 
Tf ir^-: t EFfy KLT t ltKLFLDRLANTLALKSYAaP:;AEPYAOAYN[WMAL::NTDF::LCLtDP 
Kf/^rTVR-rKrjrrOPFI RYLKOHPEMKKKt^SAAVf IKAFr.LT r (<;KPLUIYL t LVEOVA.WDS 
TTT:y:Lt.V;:i -Y PK: :I- LOKDLFOSU I ITKON ICLVNK Y'JEVLFt:AOO:'E::-';FVFGLDLPNL 

i-ofoar:; [ :a i e t eka:'vO t ux;enl i tvs inkkry inujim t r u\ :rYTL::Lvpv3DL t 

V:;ALKVrt tCFFYVt JVF(.t>[WWt FUK rNTKLNKPLOF.tTF':HI-:AAWRt ;^«I^AmFEPOPY 



f7YEFNEtx;ri tFNCTLLLUj;- AC rcYttr^iLKi^K : : -.;al^:f=fptf^^^ 
vrjLOK isKEyrADGryTTErafEAWAyfTgirr/EKOitrxg^^^- ^ 

RLPLETHOALOPCDRLrrLTT^CD I CKYFCCLF I EElLKCPUlPtrTOtL TCSLTWOWN 
ETEHSADCTLTIL^r.'; 

CPn_07?3 a'i.;a3H ^.Mil' 

rsbU-sigma regulatory family protein-PP2C phosphatase (RsbW

u.' ('».•:. . • . 

.:av:i m:: -rr- ' ■*•'*:■' • at.t.' :.x:i:i\ArvA 

OTLTOIVPLfWDVLGIJ'^DVl^LOA^ ; r !rr ^^^,•U^NE^^>KVFX : YNE I SLlKVFPfC 
KI WASS r PDlLGErr^NHK I DI PK^r^PFLAAiJ^OSPKN0E^^S•»/W0AW^)AI^ 
LYTTFSAESUJCDLLirrKOS'aTVKrAILSKYCVILKASDPAlJiUnWPCKrKEKPCQV 
FU^DPCP IDSELCPLT:^ PUJIGENPfSFKIKCTE IVJCCI ENVPS ID! AVLSYAKKEES 
FAPLWRWUIKYTAYFFC ItXCSLIAF IVARRI^LP: RKLATAHI ESRKNXNCLrrDDSW 
FEINIUiCHIFrWMVENU«00HLAiCTNFEMKOW0NALHLGE0ACX)Rt^^ 

SSSLQQAIOETSRIJTNrfrKNSG«FVTLC'-/YCyHCTSKrMEYYSCGHPPACYLDPDGETS 
WLFHPGMALGFLPEVANITSKLFHPKPGSLFVLYSDCITEAKNNNNDMFCEERWAAIOG 
LTXnCSAAnAVHiU>ILSVKrrVCMSHQHDOITU.IUCVl£S 

CPn_0794 897123 899004 

No robust homolog present in Genebank/EMBL as of 11/7/98

KSSKHRSFIXKKSOG^K3VSLY0KWWNS0LKKSLCYSTVAAL:FMIPS0£SFADSLZDL^^L 

Gr^PSVKXSGCGAFSVGY^TKACSTP^^OPFKYDVSKKT^^rLSVF^AWSCYAYG:S 

YDGTITVGTCSUWGKYWAKWSADCTLTPLTGtTCGTSHTEARAISKtW 

SG0PKAV0WASGAnvrQL^DISGGSRSSYAYAI33DCr : IVGSMESTITRKTTAVKWVN 

NVmUrrUXKJASTGLYISGDGWVTO^Arn-ATVmia^ 

CPn_0795 898008 899195 

No robust homolog present in Genebank/EMBL as of 11/7/98

CrmSCytffSSATGVSSDGSVIVCOAOTADXSVHArOYYNCEMKDUrrt^ 

GICVmCRSOIArGSWHAFKCHTDFSSN>n/LFDLI»nTKrUl£NGROI^ IFNLQNWJLOR 

ASDHEFTEFGRSNIALCyCLYVNALONLPSNLAAOYFGIAYKIRPKYRLCV^^ 

VPtWFTArSHNRWWGAFIGWOOSnALCSSVKVSFCYCKOKATITREOL^ 

EXWAAQIECRYGKSLCGHVRVOPFLCLOFVHITRKEYTENAVOFPVKYDPID^ 

CrCSHrALVDSLHVGTRhOffiONrAAHTDRFSGSrASICNr^/FEKlXIVW 

YELPYWSSLNLILRVNQQPLOGVMGFSSDLRVALCF 

CPn_0796 899280 901340

No robust homolog present in Genebank/EMBL as of 11/7/98 

SELVSSYWPCLNMSIVRNSALPLPCt^RSETFKKVRSHMKFKKVLTPWrrRXDtJ^ 

LLTAIPGSFAHTLVDtACEPRHAAOATGVSCDGKIVrCMKVPDDPFAnVCFOYIDCHLO 

PL£AVRPQCSVYPNGITPOGTVIVCTNYAIG>CSVAVKWVNGmELPMLPDT^^ 

VSADC3lVIOCai!WINLC»SVAVKWEDDVIT0LPSLPDAMNACVNGISSDCSIIVD^^ 

SWROTAVQWIGDQLWICTLOCTTSVASAISTDGWIVtBSElUDSOTHAyAVKNCWM 

IGTtaSrfSLAHAVSSDGSVIVGVSTNSEHRYHAFOYAIXXXIVDUTOKPESYA^ 

DGKVm^RAtTVPSCDWKAFLCPFOAPSPAPVHOGSTWSQNPRGMVDIMATYSSLKWSO 

QQLORLLIOKSAKVESVSSGAPSFTSVKGAISKOSPAVQNDVOKCTFLSYRSOVJa)^ 

<X}U.TCAnfCWKIASAPKCGncVAUrrGSODALV^^ 

RYDFNLGETVVMPFMGlQVUItSREGYSEKNVRFTVSYOSVAYSAATSFMC^ 

PKMSTAATLCVOTLNSHIOEFKCSVSAMGNFVt^STVSVUlPFASlAHmW^^ 

TLSWMNQQPLTCTLSLVSOSSYNLSF 

CPn_0797 901552 902694

No robust homolog present in Genebank/EMBL as of 11/7/98
VLILTVINVLTKLCWWSKKIKVLCHLTI/:TLnu:\ax:A^ 
EDWKCYTnt)IXLl-SKECMSEAHAVSCNGSRlVGASGAC(XSVTAVIWESK^ 
GEASSAEGlSKIX:E\AA/GWSI7rRBGVTHAFVr DGROKKDUTTLCATYSVARCVS^ I 
VGVSATARGEDYCWOVGVKMEKGKIKOLKLLPQGLWSEANAISEDff 
VAVKWNKNAVYSLGTljCGSVASAEAISANGKVIVCWSTTNNGErrHAfMHKDET^ 
OGGf^ATCVSADCRArVCFSAViaCEIHAFYYABGEMEDLTTLCCEEARVrOrSSBCaro 
I ICSIKTDAGAERAYLFHIHK 

CPn_0798 902810 903856 

No robust homolog present in Genebank/EMBL as of 11/7/98 

WFEI irWUVPMKKTCCQNYRS rCWFSWLFVLTTOTLFAGHFI DICTSGLYSWARCV 

SGDCRVWGYECCNAFICYVDCEKFLLECLVPRSEALVrKASYDGSVIIGISDODPSCRAV 

KWVNCALVDLGIFSErWOSFAEGVSSOGKTIVGCLYSDOTETNFAVKWETXaiWLPN^ 

EDRHSCAWDASEDGSV IVGDAMGSEEIAKAVYWKDGEOHLLSNI PGAKRSSAHAVSKCGS 

FXVCEFrSEENEVHAFVYHNCVIKDIGTLGGDYSVATCVSRDGKVIVGHSTRTDCEYRAF 

KYVDCRMIDLGTLOGSASFAFGVSDDGKTIVCKFETELCECHAFXYLDD 

CPn_0799 905001 903940 

No robust homolog present in Genebank/EMBL as of 11/7/98
KREENMAAl KOI UlSMt^OS3LVMVLFSLYSL£GYC3Cy ITOKPEDDFHSSSAVKWDHWGK 
TTLSRt^NKKASAKAVSGTCATT^/CFIKDTWSP.TYAVRWNYWCTKELPTSSWVKKSICATG 
ISS0C5IIACIVENEL30SFA\riVKNNEMYLLPSTWAV0SKAYCISSDCSVIVCSAICDAW 
SRTFAVKWTCHEAOVLPVCWAVKSVANSVSANGS I IVCS'/gOASG I LYAVKWBCNTITHL 
GTLCCYSArAKAVSMNCKVIVCRSETYYGEVHAFCHKNrr/MSDUrriJOGSYSAAKCVSAT 
CKV tVCHSTTANCKLHAFKY\TGGRM lOLCEYSWKEACAHAVS I OCEI IVGVOSE 

CPn_09no 906550 90524'i 

eno-Enolase

RKErKIMFEAVIAOIOAREILDSRG-^PTLHVK'/TTGTG.r/^EARVPSGASTGKKEALErR 
OTDCPRYOCKCVLOAVKNVKEILFFLVKCCSV/EOSLILriLMMDSDCnrNKETLCANAIL 
^;V:;LATAi lAAAATLRRPLYRYUXCFACSLPCPMMNL r : K/;MHADNnLEFOEFM IRPIGA 
S.-; tKEAWMt^ADVFHTUCKLU<ERCLSTCVCDEr/;FAPr rLASNEEALELUiAI EKACFT 
PGKD i:;U\LDi:AA:;::n-NVKTGr/OGP.HYEBC tAIL.':MU'.DRYr lOr: tEtX^LAEEDYDCW 
ALLTE'/U :eKVO r V» TPIH.FVTNPELtLEG IZtrAJSU'.m. : Kl'N^' tt rpI.TCTVYA tKLAOH 

Afrrrr 1 1 : -i tn.tctTTrTT i adlavafnago i r.TZZLCi'::rPVr\K\miME i eeeuwea i 
FTD::fr/K.:Yl<D;:Kr-: 

'.Tn_'»Hnt •taato') •n)r.7i7 

•IV lit Kxitiiit-lf.i:.* Au* I'Aitjurm H 

1 1 rrwrP.^LHAt'KAiw :i\^vkA tAPLr;At*-vRNO'/Kr:o'/r.vrrh ://;K*riT( aiiwanvhi. 

f I'l.VI JVI(NKTLAA(,»I A OKKIIKKFr-f (IIAVEYF 1 rrr^DYYOi'fC/W t Al^mrrV I KK: :t.t. f NDE 

I DKt j'I-*;atk: ; 1 1 .immi ti. i v: :: :wr.t: t yo icr: t-ENYTav»Lvi .tv :r t r.TKN 1 1 .T/Vjlvk 
mhyoa;:p t fvn:;AKKKii. ;;:v 1 1 1 1 Kl'AYFj:KUvU<LEFUlt/rur:: ti:Y::niM.TH i I'KE::vp 
;:ATi,Yrf :iiYV i i ui-va i ur tf/Kf ii.KRftMAFFDOP.t' I EKDK I Fiiirmir* t hM t KK 
vrr;FPLi*::AFONRPLTYEEAOKYFnr/tYv:;Ar. .-evoesschevoo: rnprcrpop

M f'E I H fat: :0'-'DDLLEE I RLP.LJOKH EK r LV: ITKKLAECMACFt^ELt I PAAYUHSG I 
ETAERTC [LTfJLR.W tO'/LrCVTrfLLRBCLOLPEVSLVAILDAOKECFLRSTSSLIOFCC 
RAARN [ NCKV t FYADOKTRS I EETLRETERRRO r CLDYNKEHN rVPKP t r KA r FANPI LC 
T3K0SEO PK E:;Cft PL3KEDLES0 r KK-^EALMORAAKEFRFWEAAKYRDAMOACKEQliVL 
F 

: ." vr:r • r •. • *::-■•. Ti- r i-^-; 

" ': \rvy.wKKK:'-. i ; r :•'LU^/;l^.^^v^;;;::-^^*"^uLv^^'*^I■r^:^''■■' : :.'.rLr!T:.. 

£ RKEEVLOVtrm r Vr/ WJ>a-3VC IDPTKS 1 1 YLOSA r PE r VEWLt^SML :S INR\^ 
PSUaWARNASIEECSI^CLICYPrWSADIUAKAOFVPVCKDNEAH^ 
NRLYCOVFPEPEVl^ELTSL*;C ^DCOCKMSKSA^mIYl^OSDAT ITEKVRKKYTDPNR 
I RATTPGR\m:NPLFI YHD I FNPHKDEVEEFKARYROCXrrKDIEVKARUlEELIHrUCP I 
KERRSEFLSKPLAW^m£DCm^KMREVAKVn1EEVHDKFGFSHKWRSLLK 

CPn_0803 910306 909752 

CT584 hypothetical protein

LINCKLTOErJ^POOKOAAHSLIAEFKMPIRVAKDIHERGEFINriTSCMLTOOERCIFLN 

RLARVIXXJEFIXOTDVOOTCHLIRHLUyUXEAOKNPVCElOajOEIQS 

TKALO 

CPn_0804 911074 910310

gp6D-CHLTR Plasmid Paralog

E r FSSfOJLKTLLESRFTCKNTPTKKEAIAWCRMEGDPSPLAVRI^ 

UJHYNFREO I EEPOLTQITTISAEVKO IHHOSVUJiGERrTKVRDLLKSYREGAFSSWLL 

LTYGrmOTPYNFLVYYEUTIiPEPUCIEHEKMPROAVmASRCXIPOEKKEEI IRNYRC 

ERKSELLDRIRKEFPLVETOniKTSPVKOAIJVMLTKGSOILTKCTSLSSDEQIILEKLIK 

KLEKVKSNLFPDTKV 

CPn_0805 911846 911067

minD-chromosome partitioning ATPase-CHLTR plasmid protein GP5D
CYASRMKTIAVNSFXGCrrAKTSTrLHLGAAIAOyHQARVI^IDFDAOAriLTSGLCUJP 
YDSIAVVLOGEKEIOE\^RPIODTOLDl.IPAiyrWI£RIEVSGNLAADRySHERLKy^ 
VQDKlfDWI IDTPPSLCWLTESALIAADYALICATPEFYSVKGLERLAGrrOGISARHPL 
TILGVALSFVWCRGKNNSAFAELIHKTFPGKLUmCIRRDITVSEAAIHGKPVFATSPSA 
RASEDYTNLTKELLI LLRDI 

CPn_0806 913816 911867

thrS-Threonyl tRNA Synthetase 

MAJWESPPrMEAWNKMIOVTCDOKNVEVLEGTTAAEIJUCOLKNSHOFIGVLINER 

THUTCDTLVFLTSEDPBCREIFLHTSAHIXAOAVLRLWPDAIPTIGPVIDHGFYYOFAN 

LSISESDFPLIEim^OIVOEKLArSRFTYGDKQOALAQFPONPFKrELIRELPEIiEEIS 

AYSOGEFFDUrRGPHLPSTAHTKAnCVLRTSAAYWRCDPSRESLVRIYCTSFPTSKEUlA 

HI^IEEA10CROHRVLaAXLX}I^S(X}E5SPGMPFFHPRGMIVWOALIRYVrKOUrrAAGYX 

EILTPOUlNROLWEVSGHWD^nrlCAN^^m^IDDEDYAIKPMNCPGCMLYYK^^ 

PLRVAEVGHVHROEASGAI^tJlRVRAFHODDAHVrLTPEOVEEETUJILOLVSTLYGTF 

GIXlfHLELSTRPEKDTIGDDSLWELATDALNRALVOSGTPFIVRPGBGAFYGPKIDIHVK 

DAIORTtWTrXOt^FLPERFELEVTTAOGTKSVPVHLHRAIJXISrEm-GILIENFKG 

RFPLWt^PEaVRrITVADRHIPRAXELEEAWKRLGLVVTLDDSSESVSKKIRNAO^«OVN 

YMITLCDHEINEW^VRTRI»IRVINnVSVERrurriLEEmJSLSt.TALL 

CPn_0807 913950 914879

CT580 hypothetical protein

TLCTTGUiMSE^VFLTAFIWSSSrALSKLVKNASAPIFATX^ARMVtAGAlLAIJ^^ 

FVG rSKKIFLVIVUALTGFYLTNIFEFICLOSLSSSKTCriyCLS PLKSALFSYIOLKE 

KVTUCXVLGI^LGLVSYICYLTFGGGGDDSOPVm?OICLPEliILGAASLASiXMTL^ 

lEKOSTLSVTAINAYAMLIAGMLSIMHSAVVEPWRPLPVODISOFLYATLALWISNLIC 

YNLYAKLUUCYSSTFLSFCNLVMPLYSGFYGWILLGEKGVSLGLVLAVAFMVAOCRLIYH 

EEFRQCYIVS 

CPn_0808 916398 914956

CT579 hypothetical protein

LKKLPSWALKSLKRMPOSAEPSUUilKPIIFKGACIAMTSCVSCSSSODPTLAAOLAOSS 
OKAGNAOSGHOTKNVrrKQGAOAEVAAGCFEDLIOnASAOSTCKKEATSSrntSSKGEXSE 

kscksksstsvasas etataoavqgpkglrqnnydspslptpeaot i^c ivlkkgmctla 
llgl*;mtlmanaaceswkasfosonoairsovesap\iceaikroanhoasateaoakos 
msg i vnivcftvsvcac i fsaaxgatsaucsasfaketgasaacgaaskaltsasssvo 
qtkastajcaattaassagsaatkaaanltddmaaaaskmasocaskasgglfgevl^ 
wsekvsrcmnvvktogarvasfacnai^ssmqmsouihgltaaveclsacotc 
ri^aeaoaevucohssvygooacoagoloeoamosftttalotlqn I ADSOTQTTSAI F 



CPn_0809 917794 916307 

CT578 hypothetical protein

CmWS I SSS^GPONOKNIMSOVLTSTPOCVPOCDKLSGNETKO lOOTROGKNTEMESDAT 
lACASGKDKTSSTTKTETAFCXX^^AACKESSESCKAGAiyrCVSGAAATTASNTATKIAffO 
TSIEEASK5ME.1TLESL0SLSAA0MKEVEAWVAAL3GKSSCSAKLETPELPKPGVTPRS 
EV I E IGLALAKA ICTLCEATKS ALSNYASTOAOADOTOKLCLEKOA I K I DKERETfOHMK 
AAEOKSKDLECTKOTVNTVMI AVSVA ITVIS rVAA r FTCGACU\CtJ\AGAAVGAAAAG^ 
ACAAAATTVATO ITVOAWOAVKOAVITAVROAI TAAIKAAVK5CI KAF r KTLVKAI AKA 
r 3KG ISKVFAKCTCM I AKNFPKU;KV ISSLTSKWVTVGVCVWAAPAU: KG IM^ 
WNVAOFCJKEVCKLOAAADMISMFTOFWQOASK I ASKQTGESNEHTOKATKLGAOI LKAY 
AAISGAIACAHKTNNF 

CPn_0810 918193 917925

CT577 hypothetical protein

IfeJKKt'KKTKKAV^SKAArVKRVPEEb'OEAA I CX3 LEL-WGOLYK ELF WOTFAGLTDK 
tKt trir,t t AALJrrrLEnLHI.EELTOCLFr^^VQErANFAKELJrJWMtXKNLTr/VNKOMVK 
iWE 

l(.l^l^^Frly:M^:;M::Ki^•:f•I<N.\^K^p|JKP:;A;;^mKT^<r;r'.LALL^\^^ 

KEi: IKKAIi :f J I V\\ :i -:f JC;LLiU>J ILCt JDYLLEEI'rr/AYTFYr.C» :KYNFvVA;UYjLtAA 
AL'fOf r^KYMI i :i :; .IlLYNEAAFUFFLAFPAOPDNP I Pr-Y Y I APSt-LKLaU'EEnN 
NFir//I'Mhl.'.;NNlKFKri,KERt:(JlMKO::tEKOM/V;ETKKAI'rKKl\v;K;:KTrrNKK:X:^ 



mii-L-ONA Mtsmjr.cn He*, 

c : icwu^NLTKApM.rrRR p tot-LOFLT I NC r AACE*/ : EN'.rv'.rATCEL : Er;GLDACADE : 

E I ETLCCCOCA I XtEONCCCTRAEnt U iaWRHArSK|;TlEr^TT7:ZWWyr?rffSff^ 1 
A^ :GKMErOSGIECOECVRTVn«CTlVSCEPCAROLCrtTVrVNSLP¥NVFVKW 
OGDRLClRKLIEjmiLSTANlCWSWrSECHHEIOtAKOCflFQERVAYVMCDHFMODALTr 
DKEANGVR r^/LGG P5FHRPTRCC0K : F irJDRP t E JLF : -XIO,t:DAYAUX PUiRYPVT 
VLKLYLPSSWCDFTT/H POK I EAR r LK EELV*CDC t KEA : VETLACPPC : irRTHOEI EES2 
T/PLPMFPMLET:3D'-^EEESVEFIXNt.FAYGSEC^\':L^^ 

ALMKETLTCATf jKHOHVFDVSWLK:,LWJV-;KF EKOr U;AK I RKLlUJ^iliFMa; 

CPn_0813 920843 9:i?3^ 

pepP-Aminopeptidase P 

TLILWKI^mMSHDR; LRA0RAI^E3iNLDA : LVEKSEDLAYFLHDEAIAC Ili lOOOEW 

FVYRMDKDLYSHtORVPLTFLTOCfWADLSLYVCKORYCKIGFDSASTVYHKrAOROVLP 

.Ct>rEPLECrrEXlRSIKSEE£IRRM0EAAALCSACY*DYVLTUJlB3ITEXEW^ 

AEA3AECPSFPPI lAFCEHSAF PHSI PTDRPLKKCCtVLIDICVXLNGYCSDKTRKrALC 

TPHPKU*ESVP\nWEA0KRAMALCKECW(O)IDAEXVRVlJlEHHLI7rYriHGIC^ 

H I HEYPCSPRCSQVKLESCKT ITVEPCVYFPC IOC IRIEETir lOKNWirSLTARPVISE 

LVCL 

CPn_0814 921996 923357

CT314.1 hypothetical protein 

FFLFFKLSYNriFNLPLTMVOLLSIGYSFVSriAU.«MU:YSPNYVTDLYRISLSAEESL 

OGIRAFPOA£SUJCCACALNFPDLEERLPDLRKELLFU:SNDRPDACCGKFSL01ASSKE 

CYIAAUCERVYLNVTNSSRCPVYSFSPKGVPTELWIECFSVSVtXSlVEVKVRL^^ 

I SKPRDCETLFLNPPANKLOCWEIACFRVDASFPVKOKI RR IGVOXFLLMHOGAEVADKA 

TKERVDFVSS0EENYSRYtAVT;DVU.WDCNCWC7rCCEn?GASSRAPLFEWRIDDKW 

DLWNVGGTOROTISLVKCVPSP lEINEV^REIETTGWRSWSKP rVLVGCORL r LSPDOWV 

UlTAKGWEItt^RADOIODVVICiaTCPLLVrEKIXKDUUIFVU^^ 

PIJ0OGFEPAVASOCVSSm*RSAAAHPGATNRaSS 

CPn_0815 923361 925622

gspD/pilQ-Gen. Secretion Protein D

HVfTRN5U^VALSGKLCCSSGVALTIAE»(ASL£HSCRCADDYEGHA5FT 

OLSKLV^CARjaJUVSGTEDCALWKDLIRRIGEVROYIJtCIEEIMUEIRCKGGN^^ 

VWHPETTrjfNL\nT3YCTEDSIYLIPOEICAIKIATLSXrWKESFEECI.T0ILS^ 

VROVNSWIKELYhWRKEGCSVAGVrSSRKDLEALPETAY ICrVLNSNVDAHTNOHVLiacr 

INPETTHVOVIACRVWircSACEVGELUCrfNFVOSESIROEYmPLTKlDPCOilSIL 

^UA^lEDLTIaJ7SEESLCIJlWL0Y0GRSLFLSCTAALV00ALTLIREIXK 

TVFWY*WKHSDPOELAAII^GVHDVrSGDnCASVGAAax:CS0LNASIOI0T^ 

GSVKYCJfflADSKTGTLIMVVraCEVLPRIOKIXKKLOVPK^^ 

GIJiIXRLCEEVCW(CCSPS\raWAOCTCILEFlJTCCSTCSSIVPCY^ 

NASPSVVTMWTPARIAVVDEMSIAVSSDKDKAOYNRAOYGIMIKHLPVINTC 

ITLEroririTrTCKNHDDRPDVTRRNinnCVRrADGETVIIO^ 

DI PGICKLFC»SSTSDSLTD1FVriTPKXLENPVEQQERXC£ALL5SRPG£REEYY0ALA 
ASEAAARAAHKKLEKTPASGVSLSQVERQEyDCC 

CPn_0816 925600 927102

gspE-Gen. Secretion Protein E

RCXmMAASII^OEIXDILPrrFUClCKCIXPIECSSCAITIAHATATSVrAODCVia^ , 
tcPVHF^rtJCEESErtr>RLOQLYStntmMVSEKrj.TVKggnfrr^ 

LUfinUCEAZEERASDIHFEPCEDSHRIRYRtOGVt^RHSPPSHUUALr^ 

EIUTCOTrrAPECILLVTCPTGSGrrrTLYSVLQElJCGPLTNrjfriEDPPEVK^ 

AVXPKIGLTFARCUlHXJJlOOPDIUtVGEIRDOETAEIAIOAALTCHLVVSTUmiDAIS 

AIPRUJ3MGI£5YlI^ATLVGWAORLVllTICPYCKVAYTPEN0EXSFIJ^LCKI7riMPl. 

YROOCCVHCFRSCYKCROGIYEFLRPNTLFRSEVASNRPYHILRETAEQNCFtPILEICI 

ALAVSGETTLAEVLRVTKRCD 

CPn_0817 927106 928287

gspF-Gen. Secretion Protein F

GGRMPRYRYTYLDPKERRKRCYlXALHI0EAREKlJ«EWIOVTi)IREVALRRMSIKSTEL 

IVFTKOUiLUlSGLPLYESLVSIJlDQYHBOKMCLl^TSFMEnJlSaCSLSOAM^ 

FDHFYCSGVAACESVCNLECCLONIIVVIXERAOITKKKVGAI^PCVIXVFSFAVMLFF 

UjCVI PSUCETFEMIEVKGLTX rVFGVSDCLSAYRYLFlXIFASALITVC lUUUlRIPWKK 

ILEKIXFALPCTKKFNAOCVAVNRFCSVASAIUCOCGTLIECLDLCCDAIPYDRIJm^ 

IVOAVICGGSI^OEIJVORSWVPKIJVIGMIALCEESCDLADVLGYVAHIYNEDTOKTIAS 

TSWCOPVIt-IFLGGLIGVIMLAILIPLTSNIOTL 

CPn_0818 928158 928682

predicted OMP (leader (16) peptide) 

GYTKfA^FDhAAA^rrrRDSOFSVWDRCDHVGNIDPTHKOVPMIIKCVUlCVGMKRO^ 
S ITLIEhO^WITL IG r ICGAIAFNMRCS r HKGKVFOSEONCAKVYD I LMMEYATOCSSLK 
EI lAHKETWEEASWCKECRKLLKDAWCEDLIVOLNDKCDDLV IFSKRVOSSNKK 

CPn_0819 929117 928956

CT568 hypothetical protein 

ASLYC^CLFLI WEKFHNN ICKANFHLKI ITTDFLTD I YIVTIROP lAYPLTC IC 

CPn_0820 929042 929650 

CT567 hypothetical protein

OESLPCRCCCCTFFR.'JETSr; I RTEWPMCNS I AMKKOKRGFVLMELLKSFTH ALLLCTLC 
FWRK r YTVOKOKER : YNr^ r EESRAYKOLRTLF^MSUISGYEEPCSLFSLI FORCVYRD 
PKLACAVRASU IHCTKDORLELRICN r KOOSYFETORLLCHVTHWt-SFORNPDPEKLPE 
TtALTlTREPKAYPFRTLTYOFAVCK 

crti_082i •».V}u'ii 'nnht\H 

tTV^t'"'* hyt^^Th"! uMi liitit.tnit 

irVNl.kUZtiKf-WjW I FTf .U:Lr:;i ,V:U.7AFI>AAKAnKRCACA0T r EROKNFFf: IKRSACA 
Ef EYOKK;:nM.v:A t LR tr;Kr)Kf :K^yTrKO[AKVATKKKC>RYr<LLUVPFr;Rri*NN.':RYNLYA 

I.L; ■Err*E(rY::nrrA: i k i nLU'.l<^•r/lrrrrtM'^*^r^^r^^ ianal, t::NKOE llergaou; 
riA- ( r:rr;rLn:roAE i FYKMirr :o::ujiKuiYErx;;ijGiit:KLNr. i FMnrt.tXEAVL 

Jim -I lAYRITP: Mi\ : l WhVWKfVKJIA [ 'Wa V*yj/%/yALEt J'KTHTDFra.Et.ltPKMOLLLSRY 

iM.Mi.i jJKKMir;YTi.:::/v :i »yr.Ki.vi*M/pKA t:;r":r«T.':K.': tKi. 

1- KM I VL t:rr I KN I :* (' ni-rHAO^rrir i^3r::::KKr::::wFnr:rrRKVK0Ut::NrKVf:Kv«KFr. 
: :mkai .>:a t* > a i .vt .v. : i i ai *k t : ma^ v :i .i* ( ai 'rr/r j ;fi ive t RKMt^*:r« aj: :y; : r ajiup i k 
HA iicj;n i.i>vLN t PiiFAv.ir tvTo/ rLSFirr. jstcskohcdkhodtsnkps

CPn,0H2l 'M2iJ>l T 3 1501 

yscT/3p.jP -YnpT TranlOT-acion T 

inr ALOVR rSKT: t r NGNKEI>C E CLPELFSNLCSAYLDY r F^H PPAYVWSVFU-aJVMXP 

:FAVAPFLCAKLrP3p iKici-Lswuvt rrpr/iJVDTorT:mrawjLrmx'^ rc: 
V tcrvL,\FP AAOSAGo r :tnooc r oclbgat::! i s : eotsphg i vrnwrn r i fwlvc 

• 'HR rvr^LLUrrr.EVr P tH J^FFPAEMMSLSAP rWTTMtKMCOLCLVMTtOLSAPAALAML 



CPn_0824 932677 932379

yscS/fliQ-YopS/FliQ Translocation Protein

r RTRAVLAFFATSFKSVLFEVSYOSLLLILIVSAPP I ILAS IVCIMVAl FQAATOrOEQT 
FArAVKLWrFCTU(rSC(>rt.SNMIUlFACQirONrYKWK 

CPn_0825 933618 932677 

yscR-Yop Translocation R

ER IKVrr IMRS r FRFSLCrrTLSVSCCrADASLreiSCPSRCOPTPPPSNSNPLNWOOP 
VAASSVPSYMP PLNADDVLPRDHLSDGSFSDTYPDITTOAI ILIFLALSPFLVMLLTSYL 
K I IITLVU-RNALCVOOTPPSOVUC lALILS lyVMFPTCVAMYKDARKEI EANTI PCSL 
n-AEGArrvrVAUJKSKEPUlSFLIRNTPK^IOSrYKISOKTrPSEIRAHLTASOFVII 
IPAFIKGOIKNArEIG\n*IYLPFrvrDLVTANVLVAMOKKMLSPLSISLPLKUiIVMVD 
•CWTUXOCLMISFK 

CPn_0826 934382 933612 

yscL-Yop Translocation L

HD^0alSC^^SSEVN0P0RYYArVKMK^^SLrFKDD0^«PNKKVLSPEAFSA^U3AK^ 
KTKADSEAYVAETEOKCAOIRQEAKWFKEGSESWSKOrAFLEECTKNUlIRVREALW 
LAIASVRKI ICKELEI^Prr WS I ISOALm-TONKH 1 1 ISVNPKDLPLVZKSRPELKNI 
VEYADSLILTAKPDNn'PCKXrilCTEAGIIhlAOUJVOLDALEICAFSTrUCA^ 
S3STDSSSLSND0DKKE 

CPn_0827 935273 934434

CT560 hypothetical protein 

CCLVTAOTFGTLDILMKHSKEDDLSRrU>KNLLVESPHP£EIPUCSLSFTMSWLPTIHPS 

WITIAMKEFPPEIOCKJU-AWLPEPLVOErU'IXPGISIAPHRCAPFGJUnrLLIWLSKKIR 

PCGITEEIFLPASSANAILYYTCPVKIALINCLGLYSIAJCElJmiLDKVVIERVKKALSP 

TEKrJLTYCOSHPM)OiLErnJri^SWTrnAHJlOFVHKOCaXFI/nCALTmU^ 

RiUJJVGRAYIVECmJavnrDHPyVDYFKSRLEOCMKVl.VK 

CPn_0828 936292 935267 

yscJ-Yop Translocation J

IKRYAWIMVRRS rSFClJFLKrLLCCTSaiSRSLIVHGLPGREANErVVLLVSKCVAAOK 

lpoaaaatagaatbohwdiavpsaoiteaiaiu*oaclprmkgtslldlfakoc;lvpsel 
oekiryoesclseomastrrxudgwdasvoisfttq/etnlpltasvyi fchrcvldnpns 
imv^ikrliasavpglvpenvsvvsdraaysdrringpwclteeioyvsvwgiilakss 

LTXFRLrrrVLrLILF\aSCGIXW\mnCTHTLI>fIWOCTKCrFWTPYTKN^ 
AAADKEKKEDADSQGESKNAETSDKDSSDKDAPECSNEIBGA 

CPn_0829 936729 937298 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KYICFVPTlAKSFYINIRDSRrYSWl^FIMKOTYRDFLHENYIJ<NXKSKFMKrmAGE 
FFLAhIA>WPLVPACYRR\mGKDrVl^PLVOLVILFPWVTKDSRYSPCSyfrFTCICRSrVE 
CI PWSTLFCIGRFCAVV^VBCFSCSTFDKiyHTrVAV^ILClJGILTFILRI IFSVLML 
PVWFLFKCYS 

CPn_0830 937339 937959

No robust homolog present in Genebank/EMBL as of 11/7/98
DSCSFLLPCFEVEAOTFPQVFSKV^AnfKYKSSR ILLIALLYNITLVLCLIF r HKKYLGOK 
G RVILK IYQNEEEFFRATERFPSrCACYLRVRNKNSVLFPFEDLMLVCPSVPKDFPLSAF 
KVTTKLIYWSVLES I PVVCAFFFS ICRLFA^fl«: I EDFPCS IFSRIYHTrVCVDGILCLGI 
IMF I LRIIFTLLTL PFWLISCLKSSAA 

CPn_0831 938249 938434

No robust homolog present in Genebank/EMBL as of 11/7/98
NKRKNNVL I RKSESECAFFEATOWPTIQQCYOLVm IREHNLSVRAHFDLSLSLI^^ 

CPn_0832 939750 938827

lipA-Lipoate Synthetase

VMKCRPTLrn'OOPRVRKKLPERFPKWLORPLPOGSAFHATDATIKRSCHPTVCEEALCPN 
RAECWSRKTATYU^DVCTRSCGFCNrGHSKTPPALDPTEPERIALSAKELGLKKWlT 
MVARDDLEOCGAOGLVDI I0KLREELP0ATTr*T-ASDFOCNVSALHTLLDSGITIYNHNV 
ETVARLSPLVRH KATYARSMFMLEOAANYLPDLK IK3C IMVGLCEMBCEVKOTLODUVS I 
CVR I VT IGOYLR PSRKHLOVKSYVTPETFCrrYRKViCEAMCLFVYACPFVRSS FNADMILA 
SVQDKASA 



CPn_0833 941171 939747



lpdA-Lipoamide Dehydrogenase

RG\^FE r LITVSENhrrQEFOCWrGACPSCYVAAITAAQSKLRTAL lEEDOACGTCLNRC 
C r PSKALIACANWSH I KHAEQFG IHVIXrm m*P.AMAKRKNTWOGI ROCLECLI RSNK 
[TVLKCTG3LVSSTEVKV IC00T7 r I KANHI IL\TGSEPRPF PGVPFSSRI LSSTC ILEL 
EVLPKKLAI ICCCVICCEFASLFHTU:VE ITV I £ALCH XLAWNKEVSOTVTmFTKOG I 
R I LTKA5 ISA I EESONOVR tTVNDOVEEFDYVLVAIIROFNTAS ICLDNACV I RDDRCVI 
PVDETMRTNVPN t YA tGO ITGKWLLAHVAiHOCV I AAKN r SCHHEVMOYSA I PSV I FTHP 
EIAMVCL5L0EAE00NLPAKLTKFPFKAICKAV.\LCA3DGFAA£VSHEIT00IUyVYVrC 
PUASSLICEKTLAtRNELTLrCIYCTVHAHPTL^EVWAEGALLATNHPLHFPPKS 

•T'/it. hytwtr h<;r. i'Ml ittnr<,*in 

K 1 1 «^^^^KFTE«oIm>^K^:lr.;:v:;Mllv^>>:pYC:V^FL'JD^^VA:y^;F.s.':cH t.'TFPECASK 

Kl!AKDLr'AV:j;;E0Wf7VVL# ;iVNPTORTNK0VrPEWTV/U?:.'WPL»\ALFL': ICLLAFAFLIL 

i.K::rij:;t:t.vL'iwiKNRAYiVi: I I(:/\AVAYR(;VRK:.^;. 

«:nijiH M'.u'iH •^J01^ 

m<M I :Mt/::Nr i.imllv neU'.-.isc 

UN tMVf.l^Atw\ tFROnAMUItr.LKHRKEIWOF':F.D;".T tu [ i rjKrw\ra:YWLCTt.KLODrD 

Mi.*ir,v:' .rtyiB '(xmwTAVFAvvnAi^ :iJiruit;KK('ii::FVf^AVF::nFFLD:;rpU)Ag 
• jKMv-rri .e: : ri i itlt r ii :i :i:evpjdwi .rt m,v:EEr-r/f.TNKTKi.K:;ALYP. pakkfffl 



.TtTNVTVtrAEEAKVNrTI-::!^ XKDRENHPKTRrf;;;VEYVAKTTtEM:TCPKArAL?IYA 
t PU-ADKFK00Lt^LUnr03LEYRLR YD r RLLRDAi* FS FFAVL\TPr:CLONCSL : YPNYC 
YaPTKCI>fOVVCKISPKOAITVt«E01^ES3^ 

EOGVLLFWDVCOPSyrEI RFTrvnTriTiOCFFLEmroLP roocLM 
OAALRRLPNFFSSPPNLKDLLIEVMROSRCKCLOU<PrLVCLCESRCWLFt;m.YRn)lC 
FSt-r PTPUX:U:FLPRVIPPOWPOFLTOYAOHER £ LFPNPOTRPPESYELVrCSIHRPH 
PAS PUi WLEUCTNt^SVp IG I ALOCLKSKHTFLFTCACFLDUCONLFOFLKOFLSTOKC 
V t AENT\' I AN rTDVrKU)ALAPL.GVTCrrT I ANPEOUJFFSOLKAACLPP I PQNLFSSOHO 

i V , V ; \ . • :; . : ,/.wwr: .y mi n- : -.y^u v: \: .i:* : vk • . • ; • a? r «•* - L " — : 

—V :";:.i:nr:..:Nii: ^ •;- •km-i-.v:-.: r/v -fi>vfKTKrvK:Ai^:v 

VFDE IHMAKNK550 1 HK I LCR : IJAL^KKU^Lr^TPI ENNLwEFKCLU;: ILPNYU'SnALF 
KKLrrKRCSSE£L£EIIPS0DLLUaTRPFILRRTKKLVLPEl.PDKVESrrACSLSPt30E 
KLYMATU3REKSH lOKLETPEEPATOFLH I FALLNHLKOICDHPAVFFXDPDOYIWYESC 
KWNXFVKtXKESUiACYKVVVFSOYIHMrRriTLYLEEICIKYASIOGKSLr^^ 
TTDPtCOVFVGSLLWCTCINLTACNWIMYORWWNPA^ 

LITECrrLfXRIHYLIEKKIRIiDKVIASODSNILHMLNREDLLTrLSYKDDWrrSDSEES 
PVDXPVEDOrCVLPPEDS 

CPn_0836 946960 945723 

brnQ-Amino Acid (Branched) Transport

KMKKNASHKINDKKSLSIWSICCSIFAMFFGAGNWPLALCYHYnAHPWSAYFC^ 
VCVPtXCLVSMLFTSCOYQKFFFSIGRr PCMIFITAI ILLIGPFCG IPRAIAVSHATLIS 
LS EHKSAF I PSLPIFSAICCVLIYIFSCKLSRLICJWLGSVFFPIKLVTLLWVI IRSFHI P ' 
THPMVOEFIPNAROAWIACFIEGFNTKDUJUVFFFCSIVLISLROLVAEEKHPTEEEIPL * 
SFOG ISKKNKRSLALCFFLAAIUjGMTYUirVLSAARKAGLLVNVSKCH tLCRISAIALC 
PNSILACVSVriACLTTEIALVCIVADFLARWSFKKlJffASAVICTLIPTyLtSrLNrE 
TISHLIXPLLOLSYPALIVLACCNIAyKI^FRYSPVLrrLTLSLTIVUa.VN 

CPn_0837 947777 947145 

nth-Endonuclease III

LTMKOFILRTLNAIJPOTKPSLBCWSSPFOLilAILLSCNSTOKAVWSVTPOLFAKAPnA 
OSILDU>PGKLYOLIAPCGa;ERKSAYIYQt^OIL\mOFHGEPPNDKALLTOUCVCRKT 
ASVFLCrAYCKPTFPWmiLRlAORWKISEKKSPSAAE)a)IJUU!TCHE^ 
YAROYCPAIJOaiDNCPICSYLAKlANSTRT 

CPn_0838 949196 947781

thdF-Thiophene/Furan Oxidation Protein 

rSI^IYPNSFHLFNIJClJCIt^ESSFNFSIFWJCHDrriAAIATPPGECSlAVVRLSCPOAI 
VIADRrFSGSVASFASHTIHLCQVIFECTLIDQALLLLMRSPRSFTCEDW EF Q CHtXf f 
ACSOILDALIAWARPAIJCEFSORAFUJGKIDLVOAEAIONLIVAENIDAFRIAOTHFO 
GNrSKKIOEIHTLI IEALAFLEVLADFPEEE0PDLLVPOEKI0NALHIVn3FISSFD£CO 
RIAOGTSLIlAGKPNVCKSSUJlAUflK^ 

ACORTIT»roiEKBGIERALSAHEEADCILWVIIUTOPI£DLPKILrnCPSrUKNKADLT 
PPPFtI7rSLPOFAISAJaC£X:LTC?VKQALIQWM0K0EAGlCrSKVrLVSSRHKMIL0EVAft 
CLKEAQONLYLOPPCIIALELREAUlSIGKLSCKEVTESILGEIFSKrCZCK 

CPn_0839 949230 950159

psdD-Phosphatidylserine Decarboxylase 

FLriVSRCLVOKPOYIDRITKKKVIEPirraCTKLFLYNSKLCKKI.SVFtOT(PIFa^ 

CW I^RC SWTRRQIRPF»ntYKrS£m.TKPVADFTSFrroFFTRlCIJ(PEARPIVOEaCEVFI 

TPVtXmVLWPWSEFDKFrVKSKAFSLPKLLCDHELTKLYAHGSIWARIAPFW^^ 

FPCOXPOKTRCVNGAIJSVHPLAVKDNFILFCEJiKRTVTt^ETEOrCN^^ 

CSrV7rrSPrK7rYAKCDnCGFFAFCXjSTVILLFl.PNAIRFDNDUJCNSRK»XT^^ 

SLSR50RCEZ 

CPn_0840 950141 951544

CT700 hypothetical protein

ISERRNLKruaFFGIAKRDKSOKWRlMWLVILWALAASUVIALVAICCyYRFVVFW 
QVIRE\mLSKEIJCEWALAEQQU.PILKKR5YRROCLFEVmTLftKMr^ 
KLGLRGPYTFLE lAYKAYRrGAFKECAQAFASVPOnLPgFmmcvkgAr.wf /TnT^pi ^/- 

SLIEPWISPLSHOETrvmOTIYFTSKRYKDAIDrifNRANALCVCPVEVTYNLAQAYRrr 
SSYAlCAGKLFRKLLSmVYKEEAIJyiGirEOKLGRPGgAL^lvnggmjrfgnm^ 

AAMAAMDQRDYVIAEPCWELAIJ^rSTFAKEVKCCLCYGFSLCRUUCYCnAERV^^ 
FPECLTACKALAWLCCVCYATLUJSEBCLKYAKKAVELDHSCETLELLSAC^^ 
AYEI0SFLSSRI3TSL0EX0RRS0 ILRILRiCKLPLNDHH IVEVDAIXAA 

CPn_0841 951719 954640

secA-Translocase SecA

IKRHKLCFUCRFFGSSOERIIJCKFOKLVDKVNrYDEMLTPLSDDELRNXTAEIJCORYONG 

ESU)SMI^EAYCWKNVCRRIJWn'PSrtVSGYH0RWDMVPYDV0ILGAIAMHKCFITm7r 

GECKTLTAVMPLYUiALTCKPWLVTVNDYWORDCEWVGSVUlWLCLTrC^ 

KRKKIY0CDWYGTASEFCFDYLRI»iSIATRLEE0VCRCYYFAriDEVDSILrOEARTPL 

t ISGPGEKHNPVYFELKEKVASLVYLOKELCSRI ALEARRGUJSFUA/DILPKDKICVI^ 

ISEFCRSLWLVSKCMPLNRVUIRVREHPDLRAMIOICWDVYYHABQNKEESLERU^^ 

VDEHNNDFELTOKCHCJOWVEYACCSTEEFVMMOMGHEYALIENDETLSPADKINKKIAIS" 

EEDTLRKARAHCLR0t-LRA0LLMEROVDYIVRD0OIVIIDEHTCRPOPGRRFSCXaJ<{3AI 

EAKEKVTrRKESOTWTVTLONFFRLYEKLAGMTGTAITESREFKEIYNLYVLOVPTFKP 

CLRIDHfroEFlfMTEREKYHAIVNEIATIHCKCNPILVCTESVEVSEKLSRILRONRIEHT 

VLNAWIKAOEAEI lACAGKLCAVTVATNMAGRCTDIKLDNEAVIVCGLHVXCTTRHOSRR 

IDROLP.GRCARLCOPGAAKFFLSFEORLMRLFASPKLNTLIRHFRPPBCEAMSOPMPWIL 

I ETAOKRVECRNYT IRKHTLEYDDVMNKOROAIVAFRHDVLHAESVFDLAKEILCKVSLM 

VASL'/MSDROFKCWTLPNLEEWITSSFP r ALN I EELROLKOTOS lAEKIAAELIOEFCVR 

FDHMVEGLSKAOGEELDASArCRDWRSVMVMH I DEQWR I HLVDHDLLRSEVCLRTVOOK 

DPLLEFKHESFLLFESLIROtR ITIARHLFRLELTVEPNPRVNNVI PTVATSFHNNVWYG 

PLELTr/TDSEDOO 

CPn.Oit42 MSS015 'J54710 

CT702 hypothetical protein (frame-shift with 0843)

KYYTFPT ir:R:; pw:;n r aucpi :;EpEYcrNOLUCTor;u.TTrwcn"LLNAPKDFW isknokh 

t LFr:X/x/ff rrt^tlYAOFLI AlWRRKFWtPT/NDOVWnBWTPF r 

rPnJt'-.A: ^^'^ -t'lSJiO -iWiA 

muy. r.ypiuhc'f iiMt pt or.ein < r r.wifr nhi tr with 0«41» 

riKHKL: r.xRKKVMNYixjYf Tprrrpoif ly^ i iTJjr.i.0N.':EFj\A:;LDKYyE7t w/eentooc 

DLL (7l/;F-;VFKf]T I tlOFX' 

'•r'ti_o»*44 ^^'i -•*»i.7M ••.•,;:7i) 
ypii*.* •''rrP i'.w.-/':Tp-huuiiiMi t*t',t 'rtti 

KNft t CT tMLK I A I Lcn I -Nvc IK: :i Awi / -Kc: :i y» ( VN-'ioiyrrrnrw i :o xtJiAFtn/pAOV 
Irypry^/ClIN::eoYFlJKM[YNtv^^'r;.\K^y;li'/IJ.^vtua'^•';tTE[:DAlHJVKLUJXKKPL 
iLVAf ik/j).':rcE£U; ( 1 1 etyk i * ; t I'M -rn-: rpAj f dkm t ucujjr i klvanlpepreeeee 
• :Lc:Eu;vnr:nKE:;EAALp:;fiTFPDF:;c:vFTEnF;, .xtipespooapftlkialicrp

Wt;;::;;nTEKA l;;RAD:rLLV [DATOKU;CV£KR:LSLr:;KRKKPHr ILINKWDLIXEVRM 
EHYrKOLRATDPYtrOAKHLC rCATTKP^LKK I F.-TA rOELHHWSNKVPTPI VNICTLASA 
UlRNHPOVtCCRRLR ErM tOKTTTPWFU.r IfiAKSLLTKHYEYYUOrrUCSSFNLYCI 
PFDLEFKEKPKPMN 



...:::;v. : .\ M,\f>;Mv- "^rr- :..:^^,^/y':: :":.;'!IA. ;vyAyr.";x-.*Pi)MLMN 
R PLED to I ATNA3 PT IVST X r PDV I J ICT/AFG I I'/VKOOCRLFEVATFRSDCEYKDGRHP 
OR r IFSSMREOAUlRDFTVNCSrnf DPFE2)K\^DrVECTR0IEKICVl RAIGHPRLRFSEDK 
LR lU^AI RFSSSUSFTLDPTTERAI I KEAPALVNSVSPEKIWOEUCKMLKROPYCALSLL 
UCLKVLIFI FPEUmi PYStlJnTI EFARKFNPTHFPEIUTXPLFOT/SEEAATVAFCR 
LR ISNKELKLI ESWYEALPHFONOSGWIVFWAHFIJ^ITAPLFI^LFSALOKDPSROOHF 
I SRVOELESRLEOF I LR r KTSSPWSAPDLIAKCISPCRLLGDUAEAEILSIDiECLDK 
EKILLLLOEKGFWK 

CPn_0846 959383 958112

clpX-CLP Protease ATPase

REHM>n<KNLT:CSFCCRSEKDVE}CLIACPSVYrCDYCIICLCSCILDKKPSSTISSAPVSE 

TPSOPSDUlVLTPKEIKIOf IDEYVICOERAKKT lAVAVYNHYKRIRALLHNKOVSYGKSN 

V^LU;PTCSGK7LIAKTL^KrLDVPFTIACATTLTEAGYVGE37VDirW*R 

RAERG I lYIDEI DK IGRTTAWS ITRDVSCBGVOOAUXIVECTTANVPPKOCRKHPNOE 

YIRV^^'EIIILFIVCCAFWLDKIIAKRtJCKTT:GFSDDOADI^OK^RDHLIJUCVETm^ 

AFCMrPEFVGRFNCrVNCEEr^II)EXVAILTEPTrUIVKQYMELFAEENVKLVFKKE\LY 

AI AKKAKOAJCrCARAUWr LENUJIDI^EI PSDPTVEAIHIOEiyr lAQKAPI I IRRTP 

CAIA 

CPn_0847 960019 959387

clpP-CLP Protease Subunit

KLFOEETOMTLVPYWEDTCRGERAKDrYSRLLKDR IVM IGOEnXPLAWTVI AOLLFLM 
SEOPKKDIOIFINSPGGYITACIAIYiyriRFLGCIJVNTyCICOAASMGALlI.SACTI0^ 
HALPHSRKMIHOPSGGI IGTSADIOUJAAEILTIJaaajeilLSECTCOPVEKIIEDSERD 
FFMSAEEAI S YGLIDKWTSAKErNKDTSST 

CPn_0848 961556 960177

ciy/raurl-Trigger Factor-peptidyl-prolyl isomerase

VQASSPAFPFKShnOCGCLVPRSLSNEOFSVDLEESPGCIVSALVKVSPEVUIKl^^ 

KIKKZITLPCFRKGKAPDDVIASRYPTN^mKELCELVTOnAYHALSTVCDRRPl^PKA^m 

SNSITQFDLOECAKVEFSYEAFPAISDLPWENLSLPOEEAASEISDSDIEKCLTTJIGMFF 

ATKTP\^PS0EGDFISISUiVSKSNDENASSAAIFENKYFKLSEEEKrDArKE3CM 

TGHRWETITSPEIOSFtACCTLTFTVKAVlEVSIPEIDOEKAROLOAESUDOUCAKLRI 

QLEKOAKDK0U3KRFSEAEOAIJWLVDFELPTSLI£EiaSLITRnCLLNARLI0YCSOEE 

LEKRKSELIKEAEEDATKAIJCLLFLTHKIFSDEKLTISIlEELOTfMMDrc 

DISNDTLQELVMSARDRLTYSKAIEHVLRKAn^-ASTPSA 

CPn_0849 961752 965285 

mocl/snf-SWI/SNF family helicase 

ADYIIHSYSRCEMLrn^RKU«U3FSANIL0aaCia.FE0GAVTDAKILSMNGE^ 

CLYIKIYECEIEVDRSESDTVDSNCIX:SYN«X:OHIVALIJTl^yFNEKVVA 

ETDHEINEEVKKEIJCETrVAAATKEEERKDREHOKEIUlEyVHAANALSANPrrL 

EKnSAEIAVLFVSVNEryrPAPAWPIEFOLVUUJCRSKPFyrSNIRTFLBGVLY^ 

LrCRRFFFTMOSFNASDRKLIOLLIRYVRYPNHrrEEiaiJCSAYUiPPALGVILAIO^ 

0LADRGCX;SLGEKESFSGLFCCNI£EPIOrSLTPAKMKFNIJ3FFI»«PYK^^ 

DDEVOPETTMLLESDAPGIIHHFVYHRFSPOIKRAHLRSFSRIJIDIAIPEAIJ^SFF^ 

LP\^EYAE I ANVKIXNSFVTL PYVDEVRAIClHSYUX3ELEAKU«rLYGSLRVPAASLA 

WyODVRAFISDBGIL^R^^LVEERKML£EVFSGFIYD£IUXytfRVKSEKKIVEFW^CTIP 

ANQHRITFWCPENLSGOFIYDETIFEl^FRBCSDINYyEADIJCSmGLLKCVPLDLI^^ 

SAKp^t^PKACQOSKGTRRGKWSCKLPCILVU)LEKlAPW0IF^ffirG^^^ 

KCPrJtfSLTCISUX5FEALPVfff'SMSERLIEI0K0IRGEIEF0P0DVP0QI0ATLRSY0TE 

GVHWL£RLRKMHL^GILADDMGUac^L0AIIAVT0SKI£KCSOCSLI^ 

FRKFNPEFRTLVIIx;VPS0RRKOLTAIJU)RIJVAITSYJaXQKDVELYXSFRFT)YVVLDEA 

HHIKNRTTRNAKSVKMIOSDHRLILTCTPIQJSLXZLWSLTDFUiPGLLSSYDRFVCW 

RTX^^YMG^nCADNMVALKKKVSPFIUlR«}CEDVUa^ 

ASAKOELSRLVKOEGFER IHIHVIATLTRLKO ICCHPAIFAKDAPEPCDSAKYDKLKDLL 
SSLVDSCKKTWFSOYTKMLG I IKKOLESRGI PFVYLDCSTKKRLDLVNOFNEDPSLLVF 
L I SLKAGGTGUJLVCAimriHYDMWWNPAVENOATDRVHRIGOSRSVSSYKLVTLOT lEE 
KILTLQNRKKSLVKKVrNSODEWSKLTWEEVLELLOI 

CPn_0850 965254 966390 

mreB-Rod Shape Protein-Sugar Kinase 

LCKKYV^CRYDFMSPHRNLFKUCNFSNRLYNRALGRFOKVFrfFFSCNVCXDLCT^^ 
YVRGRGIVL3EPSWAVDAQTHA\JTAVCHKAKAKLCKTPRKIMAVRPMKDGVtADFEIAE 
CMUCALrKRVTPSRSVFRPRlLIAVPSGITGVEKRAVEDSALHACAOEVILIEEPMAAAI 
IDICGCTTEIAI ISLCCIVESRSLRIAGDEFDECI INYHRRTYNLM 
ICPRTAEEI KiT iCSAYPLCDOELEMEVRGRDOVACLP ITKR INSVEIRECLAEP lOQII 
EOrtlLTLEKCPPELSADLVERCMVIJWXCAL I KGUSKJ^KNTCLSVITAPHPUAVC^ 
TCKALEHLCOFKKRKGNLV 

CPn_0851 966378 968195 

pckA-Phosphoenolpyruvate Carboxykinase 

REFGr/MVWSTNIKHECUCSWIDEVAKLTTPKDIRLCDCSOTEYDELCTLMESTCTHIRL 
CRTLYIVPFCMCPLDSPFnrVCVELTDSPYV/CSMKtKTRKIDDVLRSljCTSCKFLKCLH 

^'^^^I^JVJ^^S^^^^^^^^^^s^^scyS 

l.-jijl^^j; ""-f^ •WI^Um/OELF.':VDAECWL^E\,'E^I tGEVUC I FTirTDCPOO ITOELLR I K 
• iTijiHM: 'tt»n274 

TV i I ttyfXit.tn:T UM I pfOCr.' ifi 

! Ki J' { f urm . rfmm;.>p::Y iNFTPNVTTAUJocK tur:h\ t er j;( : jalffoelqdkaoc 
riiAu/nALMi.it -LtFVKE lEALKAAOArpKrjKv-imfwjK ^^fPI•/^^^«ovL.•;YPVTDYU^ 



LNL3C3WR0LTENMLPNT::'. jEi rA0iR5FCNt;vNCT I :/\jN7U.rrrHRLmL:r/ 

t YTYOCCAT r FCHSYCT JTPAXOP^*^ I CA INOEKSYWOAR ANnFSVT.-TOCVFOfjrATNICS 
CT3YRCIDLFXieilCVireniPTFL:.-D/(M:nJlYPVflUC™W 
SCWSTOIATFQTOKrraEXFStXKYfinMKANKESPinT 

tASl^IOKnfSNKAAKYLflELIKEITTFOSADIYYStJ lYLKCMNUJAVADP IGKAVC^-L 
NDEKTRAMADITRCNK I KAAI DKMLVE: KAIJAELSK30 r RELVrrrLTOFKSCSOCLIR^ 
SCLLCFt^LTLKAVNOFriATYEArTAErrrEPFWNWKRCU^TFESFVtOCCCNSrTroc 
OOOU>0AME3.WOFTrni0WI-W-CLES3AM0OEV^LVSAALAlXN0HV3KIAR^ 

CT712 hypothetical protein 

MIMHPKIEKRNSLPLTA|MPVFEESYHP3VArrVt3YVOATTL3IWLT^^ 
LCKAFLTSMKOCFINTCTEIAX XOASLADOSSRESRKKEEKIFKOHLCKAAPOAATATSG 
VQPTADPVADKMPLOSAFAYVLLOKY I PAOEEALYALCRELNLSCyAONLFSPLLCHIKS 
FVSAPINVNLGSYrSOTSGTANFAYCYEWILSRYNNEVSOCRLJDIASTWCAKAA^ 
SVKANVSLTOAOKKOrEDI lASYTKSUJVrHTOLTDVmTOASITrVPCUKY^ 
OCDLSIIALOraEKVLWKVDITTAVNECGU/JFPTTTLTDV^ 
LKAMOOawSLVSASLKLUCKYTrVISCnCN w".^«*u«w»ww*«**** 

CPn_0854 972849 971806 

ompB-Outer Membrane Protein B 

C:NSYDLFAAIA3SLKFC^^GDYWSESAHITNWrTSVT^SGTO^mITSTr»^TIF^ 
lYFOATDCNLSYKEWSASIG rSTYLNDYVLPYASVSICtrrSRKAPSDSFTELEXOFTNTK 

FKIRKIT^lFORv^^FCFGrICcrs^w^!rv'svrawcYORArNITSCL0F 

CPn_0855 974001 972994 

gpdA-Glycerol-3-P Dehydrogenase 
GLM^HICYLGMCrVCrOJ^UJWKGY 

I^FTn»«KEAIHNAraiVBCWTSACIRPVAEOlJCOITDLSVPFVTTSXCIEOin^^ 

IKI£VIXn>SVTPYLCYI^PSIAKEVLrCSPCSVWSAyDSOTUCOIHBAFS 

OTDIKCAALOSALKWIAIAaJIABCLSPOOIAKACLVrRCUlEM^^ 

GLA£a/a3LDm:FSESSRNLRrCHtXAWLTFn3AKAIUGM\A^BGAY^ 

tDMPrriCrimVLYMi)lJCIX:iALLLORNrKEEFL 

CPn_0856 975410 973995 

Agx-1 Homolog-UDP-Glucose Pyrophosphorylase 

GSRimNVRLTWrESVYSPSAKHVNSLAOKUCAINOEHILDIWPSr^PKOQOM^^ 

VDIOFlTUCQCWLLSSPTMlJCDFHPITSrASSCEDPEiWlAGTIIiKEI^^ 

GSRIJCCDCPiaHJPVSPIKKKPLFOLVAEKVRAASKIJtf;OPLPL^^ 

ESNDmaJ)PrCVDFrCOPLWPIXTLSCDLFLEI»mTIJUIJGPNC^ 

WKNWIEHVSVIPICNPLAIJ'FUWlXXirHAMSNNEVriKAAUWAlETC 

GTOVIEYSEIPONERFAUaraCIJCYCI^ICLYCLSHDFIRHAAyoo^ 

QUnfTSUJEKKAWCFEEFIFpiJCYSWrc^ 

DRQIQLFHKVTCKKLS PWrTrELEADrYYPSTSTSLHWDJKAFFEEPFTEAS 

CPn_0857 975808 975392 

CT716 hypothetical protein 

LLLLR0YIICTARCISRI/aU5RIjGSt^LILICVKIHmjyrLHN0KRI^ 
lADLHLERYEHFISRPJIIMYDILUYLKTlJQSSLYKOOSESLRFlJTt^^ 
KI lEXZIOMCYSKOOEZGT 

CPn_0858 977115 975757 

fliI-Flagellum-specific ATP Synthase 

RNSETRNORRTRPS^TC^DSM^mLNKaCLHrH^WQPYRACGLI.SKVSGm.IEV^aJ*^ 

GEtCKISSrKDPNU^EV:GFHNHTTU>!SLSPEJiSVALCTE\^PLRRPPSUtt.SOHL^ 

RVUJAFCNPIDKKEDLPICrHRKPU^U^PSPMMROPIDOrFPTCIKAXDArLTLC^^ 

CVFSEPCSCKSSLLSAIALCSKSTINVIALIGERGREVREYiaWSKALICOQmilAAP 

AHETAPTKVIACRAAKriAEYFREOCHEVLFIMDSI^RWXAAX^EVALARCE^^ 

ASVFHHVSEfTERACWmKCSITALYAILYYPKHPDirrOVIJtSLLOWFrLlSSS^ 

SPPIOXLSSLSRSAOAl-W.PHHYAAAERLRSLLICVYNEAXJ)l IHLCAYTPCQOEELDKAV 

KLLPSXKAFLAOPLSSYaxarrUCOLEAUUJS 

CPn_0859 977597 977055 

CT718 hypothetical protein 

Vn^LVTTPOSPGSLSOSHLPHPHDPWDTEPTSLPEDPNDKASOEXJtSLVHLFRKLSIHlXS 
EVDnVO0UCPDU.ELAU.ICEm.YKJaENPOELAUXSTAL0RHTTLRSLTPI!CVFLH 
PEDUCTLTDWXSTHELPMIKHAEFFPDTSCRRSGFKI ETPW IU10EXSEELOHLI5VLT 

CPn_0860 ^"^9639 977608 

fliF-Flagellar M-Ring Protein 

RTLVFFOrflJOCKLTALGX j FLCCLLIGCWSCAXU^RSSNPSLAPTOVKT EK l^l iW ^ 
LTOMCNPKLI ESLTKKECLEKDLTSFHP I ASAKVAIALSTEODVMSPLHLSVILTLRKEE 
SLTPSLLFS ITDYLCSSLrCUCREHISLSDNLCWLYI PE3XTVNSLFIKTLENYICKIFP 
KEHFALAYHAKAEKPTLCLTLMENYXAHLTKEESEKIVAHTKHYLYONYDOSYOXVICTL 
PFARLOrWKSFPAKVL IC5M I LVISLMXVALASr/LARHAYERVSPEPRKXKRGINISKL 
LEII0KESPEKIALrL5\*!.rPKKAEALLWRLPEDLKHQVLKYKL 

CPn_0861 '»~«752 V78025 

nifU-NifU-related protein 

ASYPPrWKF[>rrLPLEFMIPWSSLSAK*/MKKFLTPHCACTFSEEDAEAKEAHLVTCItOCH 
RLIOrVTFYWLVDKKNCV I LOAKFQYFGHPYL I PLAEAVCNLVCCKSY5EAYKMTL00X 
DKSUlVHAHOPALPEas I JLYH FVX DALDTAVEOCLEI P LEDCSLPWNGPMNLDFEBAN 
PYSOSDWEALTHEOKLVALS.\TIAEKir;PY t AMCOCEVT^/ESLENFIVT lAYSCNCSCCP 

CPn_0862 *»';i)-(24 jT)7?.2 

ytha-NifS-related protein 

nnfTV t FR ITCXIKT:^: I ::M=:K rrjriRKA PP r FWUIHOVA l p PGERVK t::%'AU UO C F3LPPC 
.'TAUtytEKTEE!: lRyLU:LKO::ilIFRF-/PMFrirAni IVI JVALVENU;MFl>;RNIf I ILPAH 

owLLlttSUTRHOnu rmTW\rrvNHfy';iavEtX;Ut ETU^TR::lXF:;t^:AAW 
LOPtx::LCKDRR iuj»xr::r)tiy:RArLTri: ii;iAOt tTF;;nAAu>3^::i(x:rFrRK':L 

ERVFS3^FPP»tT5;A;:UT::AVAAMCTA^TFJ^ I ::AM'r.FTFHT:*^ti'KKL I0EL4>:n/Lp'; l 

ot>F:;r/ONRLPN i waa i td r i-ae: : t-\niu i vy; t YP.-i u;YERa>rL,vuvLof* i; ( • :'pf 
tCH:yajiFr:LTER::Kui^;*;:K(^H,\Mni.Ai KI itTruja';;;^ 

fxpnA Hu3tfphn(|lv<'*'f .It •• Hiit.tr.>- 
EHMALL [ LLr'HC0SVWNEKNLF.7CWVOtPL500C .f SACRAIONLP rDCIFT3TLVR 
.;Un*AULA^^r^mKSKK r PY mtEXIPKAXEMSR 

"KNKK0TAEOF^EER\^LWRR3YKTAPPOCE;GLrDTKORTLPYFEKNILP0L0NCKNVrV 
SAHCNSLRSLIMDLEKL^EED/LSLELPTCKPVVYOWKNHKIEKHPErFC 

CPn_0864 9fH6Sg 992374 

yjbC-predicted pseudouridine synthase 

YGV-NVTKVR UIK FlA^ATTwrASPPKCDEI IFSCS^^V^CRVAECP^VX'/DPEOKV0VGC7rS 

V!;vr:"".'V:'y "'"A : ■ ■/iTKri ■ ^:r:/;::.:.^A:tL.^Y^■v^^.'^^':,^/•:ir:; ;:.:«'.t:; 
; • :r.~\iJr* ;;!(;• :.:: r/:"-i v.tAKri.' :K;>!EsTrr:E>;Kiiv?r''*"*-":r-"'rr/'K 

■ IWSECKKHErRLfADAACFPILEUCRIRICSLVLMLRYGEYRELTOAELCrrYHKLSD 

CPn_0865 982412 982942 

CT865 hypothetical protein 

SPMCYVFWIAGSirLGISUy^YCOLYYSVKSVI^SWyiXTVYALEKRHALI-^LSOLVGE 
EDAOSOKEIDFt^OCDKLSWRAFLKNSYEI r PTFKQiEDLLSERVDCrLES I CTIAEHDR 
A I LC IENFWASKNLJT3FEI AAYEEAVEKYIJCUlORAPUUJtf KIXRFIXW 

CPn_0866 983494 982916 

birA-Biotin Synthetase 

NMKVI YYEI EE I PSTrrmAKSYWUroPYALTVISTKCOTACTCKrCKSWKSSKGDLLNT 
FCFFIT0U<IDVSRLFRIXrrEAWMrKDLCITEAKIKWPNDVLVHCEKLCC7VLPCT^ 
ECLIX;vviX;iCUK2nTK0AIJCDVO0PATSL0EIU;HPIDIXmEliIHHI^^ 
PDSLATKSNRGNI 

CPn_0867 983405 984667 

rodA-Rod Shape Protein 

CI RI POMH IGFCHCVRCKNFFYFVINNFHILEIYSLLNSNT rMRYHKVFRYVKSWVFLW 
LTLMULSWVISSMDPTAMLVTSSKCLLTNKS IMOLRHFALGWWTFrCAYFDYmJKPW 
AWVLYFFMICALVGLFFVPSVONVHRWYR I PFIHMSVOPSEYCKLVIVIMLSYrLESRKA 
DITSm-AFLACLWALPFFLILKEPDI^TALVLCPVTLTirVLSNVHStiVKTC^^ 
IG I IGSLLI FSC I VSHOKVKPYALKVIKEYOYERLSPSNHHORASLISIGLGGIRGRCWK 
TCEFAGRGWLPYGYTDSVFSALGEEFCLLGLLrrLGLFYa.ICFGCRTVAVATDDPC^^ 
AACITVYLA>IHVLINIS»tCCIXPITCVTLILISYGGSS\aSTMASLCVI^ 
Y 

CPn_0868 986733 984670 

zntA/cadA-Metal Transport P-type ATPase 

nfr^clgvrdu^hfreyyli inei i itgryvfsrlfftsfsaevvntrfesgmsectspl 

lskonrki^knlpucsayi^urryliali^fmu£aknlsnu^^ 

n icokvvnidi lktsaafgs i f iggaixcauivr^aisealgowsgk^^ 

apttcwlvled©*lokvainkievgniuiikscevvplix;eilhgsssinlmh^ 

kschpgs ivpagahnkeesfdlrvlrtgsdstiahi inlvioaqnskprloorldkyssv 

yalsifaiacgiallvplftsi pllgposafyralafliaaspcali laiplaylsaina 

CAKHGVlXKOGVItI>RLVSCNSV\rtffiKTGTLTTGELTCIGCDYrGSKNE^ 
SSSHPIAEAIVSYLWEOKVSSLPADRYLTVPGBCW^GYTNBOEArVCRVrrcLCaCVPS 
LEDIEOKIYQAKOHGE ICSLAYVGNSFALrVFROIPRPQAKEI I0DLKDLC7YPVSMLTCD 
HKVSAEOTAEIUIISEVFFDLTPEDIUAKIREIATOROIMMVGrciNDAPAIJWATrc 
MCEACSATAI EAADIVLLHDSLSS£.PWI IQKAKC3TKKWS0NLALAIA1 ILLVSWPASLG 
1 1 PLWLAVILHEGSTVT VCLNALRLLKS 

CPn_0869 987479 986658 

CT728 hypothetical protein 

EXWRFTTPKTSEOTStXiniOHOILRKIffroDPHDHFKSf^ 

TFKCrrifHLANNALSTCVFIFFIRTLFFLIPTNRALC3^^LISLCVCW 

WAYMELSHRSML£EKNEIEENFE0EiaELRILFOOGFia3PLL0D<VEYVCSDSTU^^ 

MIREELYIRKEDLPHPLICXX;SRlIXX;LCGIJVrFt^LVLCISYTLACVrSAI>IVLVI^FL 

KAKIUCNDKISEMNArfVLCIFITSASriSSLMKLL 

CPn_0870 989881 997448 

serS-Seryl tRNA Synthetase-2 

TTTHPTOGFGGAVI LPF5P IS I ARRIKKSCCSEKSSIYSHFCTLLLNNETSMLDIKI IRK 
TPEECETRrJUaCDPKISLEPVLSLDKEWOUCTDSETWAORRU^ODIHKAJ^^ 
NLIOEVETLAADLEKIEOHLDOKNAOLHELLSHLPNYPADDI PYSEDKACNQVIKSVCDL 
PIFSFPPKHHtXUJOELDI LDFOAAAKTTCSCWPAYKNRC\a-LEWAJXTYHLOKO^^ 
OC^OJ'PtXVKKEILFGSGOI PKFIXWYYRVETCBOYLYLr PTAEVVLNGFRSODILTEKE 
LPLYYAACTPCFRREACAAGAOERGLVRVHOFHKVEMFAFTTPNODDIAYEKMLSIVEEM 
LTELKLPYRl^LLSTCI>lSFTASKTIDAEVWLPG0KAFYEVSSISC?CTDrOSRRSCTRYK ' 
DSOCKLOFVHTLNGSGLATPRLLVAILENNQOADCSWI PEVLRPYLOCLEILLPKDQ 

CPn_0871 988766 989899 

ribD-Riboflavin Deaminase 

EYMEDFSEOOLFFMRRA r E ICEKCRITAPPrnWGCVVVOEhTOI IGBCFHAYACGPHAE^ 
LA rONASMP ISCSDVYVSLEPCSHPCSCPPCANU.IKHKVSRVTVALVDPOPKVAO0GIA 
MUlOACI(^nfVCtGESEAOASLOPYLYORTHNFPWrrUCSAASVlX;OVADSOGKSOWITC 
PEAP.HDVCKUUESOArLVGSRTVLSDDPWLTAROPOGMLYPKOPLRVVLDSRGSVPPTS 
KVFDKTS PTLYVTTERCPENY I KVLOSLDVPVLLTESTPSCVDLHKWEYLAOKKI UJVL 
VEOCrrrLHTSLUERFVNS LVLYSGPMXLCDOKRPLVGVIjCNLLESASPLTLKSSO ILCN 
3LKWWEISP0VFEPIRN 

CPn_0872 989903 991216 

ribA&ribB-GTP Cyclohydrolase & DHBP Synthase 

KER I FRVACLASESVNARESMI ETREEVCSANFVSLEPAIEDLRACKFVI WDEASREDE 
GDLt I ACEK ITVEKKTFIXOHrrcwCAALSOERLLSLOLPPMVKDNRCRFKTPFTVSVD 
AAHGVITGVSAADRTKVVOLLAOPKSKPEOr IS PGHFFP LASSPCCVLKRAGHTESTVDL 
MELACLOPCCVLAELVNEDYSMMRLPO I LEFARKHN I AV I PVTS I lAHRMLSDRLVSK IS 
aAP.LPT I YCOFT tHVYESLCECMOHLALVKGNVAGKSir/LVRVHSECVTCDI LGSKRCDC 
GE0L53AMSY t AEKGTCVLWLRCOECRC ICU:HKVP>YAL0DNCYCTVDANUAMGFPVD 
SP rrOICAOaVDLKLTTr KLITHNP0KYFOL0GFOL3ITERVPLPVR ISEDNEOYLRTK 
OEPMCJHWLDLPCCNNRVQ 

ribE-Ribityl lumazine Synthase 

t.:HA-/r I* ;YNNFEE\*MKTl.K(;HL:'*,\KNtR t A I VCL'CFrCAMADALVlVrrOCTFLKFCCliE 

1/ lUfr 1 9'J[^ ;afe t nrr ikku.::: :i:;hKFDA[ VAcnvL lOCtruirYW r vnovA/u: [t u\lj 

LEKCLP ITI,:: I V,\Ar::AE I AVA^R^i E KCRHUiVSCMTTAI EMATLFTO I 
♦T/';': tiypttrhi.-r umI ptnr*rin 

f.f :;Lfjt.K iLTKOHtmRFAJMLK ILK rKvr.vFPLALQ'f/.'ft.'; £f;YA(:r\\::;Lf/rN;;QTKVK 

r^.-rFT/W r f-VKCRUYPC: LLMLTET/ snArLLTrrPP r OM/xYnEKLFMKKVrALD I A [ r<;JM t(lL 



hllioc:3POTw:l^ tLP;:. xrFvoFTTAitKyLLrFLK/r k - FrNT:.iT : letaivl 

RHVa:3AKA'/TTFKPYrrC.7CF^jrrAKALH^/LRTFrEL.:r;n-ARLJrE0CE\'LL5LRRL 

0NYDSLLrrt.Tr/p:rA0u::3VttTOTjmciw:c:rLYLrdLCniP- 

(X3HAT I EEAFGP YFTTfWNRL□FEX^^mcmL^'RCATOClL J f 3CASTLAWSriC^^ 
EA£WLVNSrn'/CCEHIPLTFRCLP3LVACL3VATKC3r/3PENRLRCLYSTML3U.VKS 
LRSHREMLNK0LLP0CTVLDFSETT1.3SCGLCVFAEG I AV-R IHUJCAVS INL 

CPn_0875 1'.U'i3 I'MOZ: 

RL JRRSRftLFAP.HC^T wKUfLw'Vw'ANFKTVAEK ; JE^DERCL JtV/SSAAk^^ 
OGEIKOALYR IP EVH PLALI EALAENPAL lECMKKMQCRDWIWNLFLTCLSEVrSOAWSO 
CVlSEEOIAAFASTUILDSCTVASrvOGERWPELVDrVtT 

CPn_0876 994123 995517 

dagA-D-Alanine/Glycine Permease 

S lATCETKLYFI EOLNKLSTSFCVFPM lUXCCFLTWKIJlCLOrHCUCLCFTILKLQM^ 
DSSSKAWEVSSYEAVAGILACNFCTGNIACHAVALAgGaPCALVWV^tAA ^^^A ^VlJV^ 
SYU;SKYRKPEOnT:EFIGCPIACIAFCMRjaCILACFFALFTIWAFCAOJCV0VS 
LCAECTPCKLLVCILI-ALWr PVLACOWRILRFSARVIPFIAGFYCISCC I ILFOKASA 
ILPAIKLICSSAFCIKACLACIOCYTLSOVrSTCINRAVMATOCGSCMVSILOAWriCSIQI 
PVVtCLVTLVPPVIVHVVCSrniLVLmGAYSSGAOGTLMVMSAFlOlSlCSLCSV 
AMALFCYTT ILTWFACAEKSLOYMIPGRRANLWLKA lYVLI I PLGCVIDKRMIWALSOTG 
FSGM\^LrCrALIAUJa)VLSTNRrVALUCERECSVADPVRNLDA 

CPn_0877 995521 995982 

ybcL family 

RRRIMOLLSPAFAYCAPIPfaCYTCOCACISPPLTFVCVPGAAOSLALrVEDPOVPK^ 

DGLWIHVaVYNLSrriTM-AECAEIFAVOGLNTSGKPVYECPCPFDKOHRYFFTtJ 

VLPEEENVTRDOLYEAMEFHIIEOAELWGTYDCS 

CPn_0878 996660 995992 

SET Domain protein 

GCMSTVTTEPCSSIHISL^^^DWHDSOPVSLDRASELLHFRFLPSL^^SNWKVrtQOIET^C 
HKSEKRRLISPLAKWLGKUiKODLLCPPAPPVWCWINAHVCYGVTARDEIAPWr^ 
TG ILHHROAIVWDENDYCFRYPMPtmilVrTIDSGKQGNVTRrriWSEOPNAEAlC^ 
BCLrHVIIRTVAPIYACOEICVHYGPLYWlCHRKKREEFIPEEE 

CPn_0879 997463 996645 

yycJ-metal dependent hydrolase 

YRIWflCVSMCX5rFPLASGSKCNSAYLGTDSCKILI0lX:vSKavVTREII.SM!«IDPEDI0A 

IFVTKEKSDHISGIKSFVKAYNTPrvCNLErrARALCHLLDSHPErKirSTGSSrCrODLE 

VOrnJVpHOAVOPVATIFHYREEia^GFCTDUWVTSWITHELVI^^ 

OSORPDVYKKRVl^KLGHISWECCQUiOKIITPKLKKLYlAKLSTBaff^^ 

SIASITSXAFEIALAQGrrSPIYFSRLEVACPR 

CPn_0880 999864 997444 

ftsK-Cell Division Protein FtsK 
PMIRERKKSRHPRIJTLPLAAKASLYtJTACFSCt^LWSFHRDOPCTOWrtCUXa«rSS 
FLLyFFGAAAFFIPLYFLWLSrLYFRRTPRPLFrnCAAAFLSLPFCSAIU,SKLSPVCTL 
PALLImlLPK^ILC^©^»PVSYVGCIPFYL^yEO0SFCLICHLICSVCTALIrcn«L^^ 
YirOGlALLKWrrPODCVKKAFCSFFCTrcnCNLKKLINR 

SOPSPRRVSCTIIUCSISPLPOEElPCSWCESFFLTPHPCKRfLTKPVEPOOOCAItra 

TIAl^STPTVNnteSKCKERAALPKLKSLAVPQIDLPOYHLLSKriREARPESLO^^ 

LILXaTLT5FGIDADi;a<ICSGPTLAAFEVX.PHSGVKV0KXKSL£NDIAU^ 

APIFGKAAVGXEIPrPFFQAVNFRDLLESYOlCTtmKLOIPLlXGKKANSClNl^^ 

HLriACTTCSGKSVCir:rrVMSMIKTTLPSEIKLVIIDPKKVELtCYSOLPHKL^^ 

SREVYNALVWLVKEMESRYEILRYLCLRNIOAFWSRTRNiCriEASYDREIRElWPF^ 

IDELSDLLLSSSODIETPI IRLAOKARAVCrHLILATORPSREVITCUKAMrPSRISFK 

VSWCVNSOIIIDEPCAENLMGJKnWLVIXPSVFCTIRAOGAYICOEDINKVIOD^^ 

TOYVIPSFHAFODSDSDNSGEKDPLFAQAKTLILQTGNASTTFLORKLKICVARAASLID 

QLEEARI IGPSECAKPROILIO^<PLEC 

CPn_0881 1005646 1006209 

No robust homolog present in Genebank/EMBL as of 11/7/98 

NKKFAVHMPVPIDNSSRNLOEVPESLEDLEQKAEESPTHOSAESSSLOLSLASSAISSRV 

EQLSSLVLGMENSDFSSLRDVPIFSAIYESSTHTPWPLVGWYINGSOSCVYDTORES 

U<t^U/;SRRVEVVYWNniEASLUnXPRRPRRDPSPISLALIXtirfEAFri.EHPPCS 

TFNPIFFW 

CPn_0882 1006169 1007404 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NTPOVALLIOYFFCNGAFYVREALRLTPHAON IVLVG ICPSLYPEHPRSEYYRVSCOrCS 
RFDDRGFVNSCVETLPYSSCSFGIFWISFTDPTFWFAIVNTFMRTAGINEVSRPKTODTE 
TSLIEMRDLSEOOEANfn'DSLEOEESLMGIVCHTVCGVSMTVTSSPNIFYRIOTLUa.PE 
TLAEAEENPTFPNST I DSLAEIMMNL\m I SDAVS IFWI FP IVOTTYNCVLLAVCIGFFCI 
NG ICSTFLMLTNPRSRRORWRNLR IMVLCYRSLCSCWJLFDLSKMVRMAARRHVTSCTVA 
LYAMVTLFCVWAI0OALOVGFPSVRDAFYRYCLRHRYCLT0Rm)SUyrTCTRP0VTRT 
HLEDQOMVASILNLSVFGLFPCFVCLWTTFCCLEISPSCRWnAANNRTVGIF 

CPn_0883 1008904 1007573 

dmpP/nqr6-Phenolhydrolase/NADH ubiquinone oxidoreductase 
LYELFIKSC lFU>rrWLSGLYF tCIASLIFCA IGV ILACV tLLSRKLFlKVHPCKUtrNO 
NEELTKTVESCCTLLVCLLi'SG I P I PSPCCCKATCKOCKVRWKNADEPLETOP.STFSKR 
OLEBWRLSCCCKVOHDMSLEI EERYUIA3SWBCTVISNDNVATF t KELWAVDPNKPI P 
FKPOGYU}ITVP3YKTNS:;DWK0TMAPr/YSDWEHFHLFDOVr0NS0LPADSANKAYSLA 
iYPAELPTI KFN I R t ATPPF I NCKPNCE I PWOVCSSYVF.ILKPCDK tTVSCPYCESFMKD 
DDRPL I FL ECCA^::;::FnR;:H I L0LLLHKH5KR EI DLWYOARSLKEN I YQEEYOILEROFP 
NntYHLVL;:ErLrEDrAAl>roKDDrTKTriFLFRAFNU'«gLnnLDNPEnYLYY'/r:GPPLHN 

::uiLKLLGDVi^T.R:'.;;t rrx>DFf;r: 

'TfTi.OHM*! li)|)>Mf,K lOO'MlO'/ 

irr74t hyptiriiiT ic.ii putr.oiii 

t :rX.ML.';R r VTCFI .KLlJ :: U.l'LPAEEFywAO: :KhrrrA^l'AVMr^ I A I LK(--Yr fr.WH-EOKRR 

KAMKKH KfJDL/XKv'.PKVfAMi : [ ( ( rrvDL tp.miviuit/^A :k7 wi jci u\ I : IE I irf-NrNK.'; 

ygi:A'fHNA MiTliylr t .iiM.t.t -.u- 

A::t(;m:nTK^:iiiM:\ftiuk-:Hv::rr^\tj::LKKKEi:LUM.KAit.v(\':miAp 

:;LHr:ftNKHEF:;FFCrYh? :kK::|; :KI::;:ir IKK': I fVTI^ rLLIIiarrMIJt LKl.Tm-WWlJKIl 
f'Kf >CAYKPI -KNK* :: :i ; rrr.tVim *« :i w ,| IKMV f I .VVIU rt'l KyUVNFAt ■ IDfrWK K 1 1 is^ir.i . 
N [A;:mrffc:t;rv/AARcc.Tnri'rrKLLVOAp.':t^>>' .-socNSA^F^LRpRsrroPOrTC 

AAK r : ETA»' F.F I N PECCETLICLYCC ACT :C 1 HLZ PYVKNV ICVE 1 1 POAVAS AOENI KA 
NNKECCVr;-/-LEOAKArCKRNDCKAPDVt £ £DPPRCCM0S>C/l.KVrLR:C3PKIVYISC 
NPKTOFCECADL 1 30GYR r KKMO P C DOFPYSTHLE?;! CLLEREI OP 

CPn_0886 1011299 1019009 

hctA-Histone-Like Developmental Protein 

frrLFMACK[rrAKKMKOr.LZSJ0HDl-,\KAEKrn^KAAAORVRTOSrKLacVAKLVRK^^ 



CPn_0887 IC11692 10U157 

CHLTR possible phosphoprotein 

MKKLYHPTLFLRPLIRLSLI FALGLTL ISCNrPOQKSrCHCCADMHSALISCKNCEELFA 

DFIERVIADRETLTARa<rr^/VL\mEYLtJCCrRKGDCOVCVKIWm^^ 

LO r l^Rl^PE0APUlDVVtX:LFTrCCHE3L0DHU^EXrnmj*SCYE^mK0afl.t^ 

OGDVKKAIEtJUCELVAALEKCSCSPHPEIVOIOTFTOKTtXALOIKVAOEAOESCn^ 

TPYCLSEIA'n-EAMDALVLRIARGEVSRTNETOSVI^HALOHLPFAREKArPELEV^ 

HGAY^XSTtXYYAYFSIXELYH0^na5FASLERU:EKCDAVFVPEHPVFPEYGFTU^ 

AKGKYESAEKVFLOI IDPAVXLCy^TTARAYErLCXrAYVONHYEKAZEyFUlAYK^^ 

ESCICLrWYAV0KKKTACEI>ILYHPKFSFTYRHLUKLCSt^PHGE2OCGSSAI0IWHR 

AV PELSEIYSRCIYttI r KYRN^^YrHPI I ELAVH]VRNL£XRNLEEICRDA0DPEYI3KAL 

AFWCALOSCASVPRSLIESSDVDEARITIRCyEALYFHNPDAIAMLPOATSEECNSWOTA 

LRLWTLVRPKGAPWtAKYWDHLVUlPHGOSLyFFCYDLOEYLrCKEDAIJCHLSVFAJ^ 

PKSSr^LVYYl^SESSAUlKVCWn^KAI^EFTEISWSGEHHKTWAYIYYMVK^^ 

TYISL©*FS0AVHILEEVKEl>JOVASHPKLHFLKCEDCYLAMELRWN^IAYAyF0LH^ * 

AHLSNKLLEHVEKNLISPRSYROyYGESLORTLCLCORTLGV 

CPn_0888 1015441 1014119 

hemG-Protoporphyrinogen Oxidase 

AERRFCVKRAI I IGACISGLAAGWWLHKKFPOAEILVLDKEAVAOGFVRTESPOGFSrDL 

GPKGFLTRGDGEYTLKLIHELGWNSLIFSORAAWmFVYYRGKAHKISTWrUJUC^ 

SLIKDFRAPCYTODSSVDDFUCRHSSOtnT'SYIIiSPLITAIRAGHSSILSTHMArPELAK 

RfJVSSGSIJJlSYlJ<>mSPKKSKTDRVIJkSI^PSMCrrLlTTI0E3a. 

3 PKEACVrrPSCTFFAttiVrrTCPLOOLPVLLPNYG lEIJr^KRVLPWNl^SISIfi^ 

FStPKGYGMLFADELPLLGnWNSOIFPOATPCJaVLSLLIEGKWRESEAHAFAIAALSE 

YLNrPJOKPDAFALFSSQDCMPOHAVGFLERKERILPHLPCNLKnWilAGPCUJRCIAS 

AYHAICDLHTEETIAQPQSSL 

CPn_0889 1016341 1015462 

hemN-Coproporphyrinogen III Oxidase 

Fli<rNVNrKrLEGIJ<OPAPRYTSYPTALEWEPSDAAPAIIJVFORIREN^ 

COSJCLYCGCSVVUIRREDIVEAVINTLIOEMKLVVrriGFRPOVSRIHFGGGTPSRLSR 

ELrrU^HIHKmSl^HAEEIAIEVDPRSLRNDMEKADFFONVCFNRVSlJCVOD^ 

OEAVRRROSHEESUCAYEKFKELArQSINIDLIYCLPKOTKESFSKTrODXLAMrPDRLA 

CFSFASVWIKPHOKAKKASDMPSMEEKrAXYSOSRHIXTKAGYOAIGMDHFSU»HDPLT 

LAFXNKn.IRNrOGYSLPPEEIJI^LC^STSFIRGmflNAJCTl£EVTOm^^ 

KSKILTEDDRIRKWAIHKIiCTrriNKEEFFmjUVEFDITFIESRDRLISMETrGLIKN 

SPGSLKVTPLGELFVRVIATAFTJHYFLNKVSKKECFSASI 

CPn_0890 1017829 1016819 

hemE-Uroporphyrinogen Decarboxylase 

STUCWroSMSAFFDLUCSOTASHPPIWUJlQWRYMPPYOELKGSOSUCT^^ 
ATLWPSLLHVDAAILFADILSILDGFAVTYDFAPGPRIOFSPEOPFTFTSOPQTIFSYL 
LOAIRTLKOKLPVPLmAASPFTIACYLirxXWSKDFSmtSFLYVyPEKrDOLISriI 
BGTAIYIJC^0^ffiACAAAVOLFESSSUU.PSALFTRYV^EPNRRLr AKIJCEO^ PVSLFCR 
CFE^NFYTLOATOAOTLHPDYHVDUiRIOKNLMI^UXSJUJPAirLLPOEKLIiiYVEAFL 
VPLRTYPNF imSGHC I LPETPLENVQLWSYVQROL 

CPn_0891 1021079 1017819 

mfd-Transcription-Repair Coupling 

NFMAMDFNPVm-DFSISKEnCErrLPLLLEWIHPGATAriJUKHFHDCRASVIMITTPAR 
LDDLFENUITFLDOAPVEFPSSEIDLSPKLVMIOAVCKRDHLLYSLNOHRAPIFCVT^ 
AU^EKTRSPOATSOOHl^LAVCDVLOPEATTELCKSUW'SOVMLTSEKGEFSCRGGrVDI 
FPLS3PEPFR r EFW:EKI IS IRSYNPSDOLSTGKVSKIS ISPAYTEEASGGMYSHSLLDY 
rSTPPLYLFDNLEILEDDFADI SCTLSSLPDRFFSICTLYDR I STSNOVYFSETPFPNVX 
NLKENRVI I EAFKRNMEASROA: PILYPEQI lONDENPLtAFUOHLOEYMPPHGKPLKLA 
:YSTKTTCSLKEARAlACTVARGDVEIYEKTCm.TSSFALVNEAFAAISLSErASTK\^^ 
QKORTHFSVTTEEVFVP I PGETWH I HNC IGKFLG I EKK PNHLNI ETDYLVLEYADKARL 
YVPSNtJAYLISRYVCTSDKAADLWHLNSSKWKRSRDLTEKSLIVYAEKLLOLEAORSTTP 
AFWPPHGESVI KFAETFPYErrPDOLKT IDOIYWI^IMSPKLKDRLICGDACrGKTEVIH 

I PRTIHHSLSCAROLSVI AMPPLDRLPVSTFWEHNTETLTAALRHELLRCGOAYV IHNR 
lESIYTLAETIRNLI PEARICVAHGQMCAEDLSNIFTKFKNQKTOI LVATALIENCIOIP 
NANTrLIOHADKFCMADLYOMKGRVCRWNKKAYaTLVPHLDRLSGPAAKRLAALNKOEY 
;XX»tK I ALHDLEIRCACNl U77D0SGH IGTICFm.YCKLUCKAVSALlOCHTS PLLFNDDV 
i^[Srr)^??I^?T^^^-'^"''^^f'^'3>«ICNA£SSEELTArOEE«RDRFCPLFOEIC^^ 



CPn_0892 1023673 102104P 

alaS-Alanyl tRNA Synthetase 

5«n2:!5J^i?!^^ir'^'^'^*'""^^^P^^f^P«^PsrLFTNACMw 

^ ^VU>K [ LRRi.-VtryCRR LCFRNPFLAE r VP^JLAPAMGEAYPELKNJ LCO IOK-/LTLEEES 
■IVFT^'/rAOVNRYRHKR l/\NNHT.V:HLLltK/\LEITLCDMIRO/V:i:WOarK tftLDFTHPO 

A I :; PF.DLU; [ ctlvnf:; t n ENErvo i r E.\LYcrv>tNr;nE i KorFc,DK\'ZD\r/?'/\/OPCHa 
iiKi/.-^y.-niAiivrf :iJt« :ffr rTKFJiAVAMU irr i E.\VTr;t: kakaivhoO'E'^a.eeiatllo 

VH<Uvr^::Ri;rATt.DEnKO'JDKRUlEt^:;LrorKLOKLinNOMtL':C'r'JL7HHL/\EHE 

tmui/jj/f^y I .iioit I i-f :kl c:;LvrrEKf* ;ky I vL:^Rvr:nnt.n\x :vnA0OLLKAVLTPCC 
' ;hwy;Ku^:.V ;::APALi'ATr:vtwETLwywi:rroLi 



4: 

EFUAFCLC I .-YSCCrr r ECL -M I NKKLC t LCK I ACAI ICIEG CJKAJ.'XSHPCL 
PLGCAEUVAYLYCTVLRONPRDPHW [ NRORr/UJAJltCSAlZ-Y JCliiLACFSVCLEDLCE 
FR0LHSRTP^HPB^CrrVtrf&\T:nirUWOLnKAVrw^^ 
YCLACOTCFKEnvSHEVCSFAaSLNUWbWIYDVlJWVV^^ 

WWYEIDCnfDFTHIHETFSS rKRa3ERr\XVlAHT r :CHCSPKE3rJ«AHCSPLCnTOTH 
ETKOFWHLPEEKFFVPPAVKNFFAHKrCEDRKAOEOWt^EVRVWSKOFPELHEEFVALTS 
HK LPKNLEJL^T;3VE;«P0S r ACRAAJNKL lOVLVOH r PYL ICa3A2I^S5DCTWIANEir.* 
t HTVT)FSGRN t K Yr;VP EFCMAT TMNnLAVSOVFRPFOCTFLVFSOYMRNA [RIJWU^KLP 

: 'vrr:::, • " - - ^ • • - ..-..::p;yp;^,..... ........ ^ .. 

• -m-' : :;• :v:.>--.- i-r: fTri."\ : • •: a:..-vake:-Eh 

LDKOVRW:;r PCWELKfcJXwL V :;VKw J ; \'^'JL^ I RV J : L^AL^JmiKi J JJBtXAiAMD 

RFCYSGASDovsEEccrrrEC i:-cr::-sc 

CPn_0894 1026823 1025889 

amn-AMP Nucleosidase 

PRNDKNAKNUWKMYKGERVSKHTSESRIAOWtt^RYSGSSVKOrCP^ 

FAiOiiCVPVFECSMFSAAHAPHUCTSIUSrKLCSPCAALTIOUrsrLPOUCAAXJi^^ 

GLRSHYOVCDYFVPVASIRCECrrSOAYrPPEVPAlJWrwOKATTEVI^KK^^ 

HTTNIRFWEFNKKFRKKLYETKAOSAQlECATLrAACYRIWLPISAlIilSOLPL^ 

KTKSSCNFIFNTYTEDHILTGOEVIEM.EKVHLKRAASDHKKDOOYJlCLPm 

MASGSETSOSDY 

CPn_0895 1026973 1027557 

efp-Elongation Factor P 

EIDCFKVRVSTSEniVCLR:EIDCOPYirWNDrWPCKCXJAF>raiKVKNn.Tt^ 

YXSCESVErADIVEjlSMRLLVT[X?EGATTKDDETrEOEVVFVEKI^IRCWLl£OTmL 

VLYNCpWAV^PPIFMEI^IACTAPGVRCOTASGRVl^ 

KVDTRTGSYESRVSK 

CPn_0896 1027574 1027822 

CT753 hypothetical protein 

EKYFFFTVRWlEAiaCIKELSKEAOUJaCLREKSRVIXEKNK^^ 
KEEKVETPQLFOAIAEKILEECV 

CPn_0897 1028794 1027853 

(phosphohydrolase) 
NFSIiJSrnVIX3KNXSNPRPM0EKPRHV}miIHISOVHFH\rt.PVNPVHCFt^ 
FCLVHFOATTICORFPKVVRSUUDSVCITCOrSLTAMKErLL^KHFVETtJ^^ 

u»amInnrrLKSLAOQT^YTH^pNIX}Loo^ncvsFHKITDHVA^ 

VHLAOISAIEITLLSLSPEENVI lArWYPU^SONPSHDLrNWrHLCXiVLiOCYPICVRLYL 
HGHEHOAAVYNCAOTSPSYIUISCSISLPTKSRFHVrDLVPEKYOVjrmiLKNLLDFDAP 
IXZANEATWtXXKL 

CPn_0898 1030511 1028904 

Mitochondrial HSP60 Chaperonin Homolog 

TKKRU3SVKriJlII/:vatfE0aCLSMYNADXKLFSCrDK^ 

KERCFyMSQTELSNSYnJlX;VDrAKAMVWCIHKEHSIXUTTOIU^ 

GISTHKLIASUOflGEKLOEALOQOSWPIKDAUCVRNIIFSSUWPTI^^ 

PEO^SITKSlENDKTSKDVFOCFXIPAGYASTYFVSOTASRLTRIAHPLIt^^ 

IHSIiPLMEISEONQKLIIFCEDIDPOVlATLVVjnCUXJU-aVTVW 

EDI AIJTCTICPCQ EASHVIAPOtVTIgSCLSI 

AEEl ttl-ISCt*. 1 luwL IKSTNRLOSSVA r LPTDEI»EPLYTIALK1KESAL^ 

VAUTASLXlXrrPKTOADENSIAISLWKACCAPUCli^TNAOt^nAVrAKL^^ 

LCISVFSREIEDLlAGGlLOSLArrSTIUOAUTTAILVLSSKILIUWOrEISTt 

CPn_0899 1030848 1032215 

murF-Muramoyl-DAP Ligase 

^mRCCRONYHRAKU*EI>WSLKt^DVSCPKCDKKITCFAIDSQQVOPCDWTAI^^ 

OTOFtJWAATACAVAAWSHDnf<Xn)SrGLELIRVm^ 

GSVCKTTTKEFSKTri^SIYXTKASPKSYNS0LTWLSLI2!AEa)EDVMIl^^ 

M0DLLRIVOPEIAVITHINDOHAMHFP0GIOEILK£KSYIWKSKL0U.PKDSPrai^ 

SCSPTAEKFSFSFNDPLAOFCYKAISCOSWIOTPEENYCLPIAFSYKPAYTNLUAVAI* 

SWrtXVPEECVIRSU>EtJCLPPMRrEHSMRNCMCJVINI>AYKACPEA«IAALI)AI^l^ 

GKirLIUJHMAELCRYSEECHALVAEKAASRGDMrFFICEKWr PVOSVLKSYSCEVSFFS 

SAOOVXDILKQVARYCDVILLKCSRALALESLLACF 

CPn_0900 1032208 1033281 

mraY-Muramoyl-Pentapeptide Transferase 

LVFNFLGASMI PLIPMFUCOSLFFSLALTCWITLVLWALCVPVMCWUCRXNYRDYIHKE 
YCEKLEMUiKDKAEVPTCCGVlXriSLIASU,VWLPWCKFSTWrriIU.TCYACLCWYDD 
RIKIKRKOCHCLKAKHKFMVOIAIAAFTLIALPYIYGSTEPLWrLKIPnttXaiLSLPFWL 
GKVFCLCl^VAIIGTSNAVNLTDCLDCLAAGTMSFAALCFIFVALRSSTIPIAQDVAYV 
LAALVCACICrumiGFPAOW>CDTGSIXLCGtXCSCAVMLRAECILWICGVFVAEW 
SVILOVLSCRLRKKRLFLCSPWHHYE-WLPETKIVKRFWIFSFVCACLCIAAVLWR 

CPn_0901 1033239 1034537 

murD-Muramoylalanine-Glutamate Ligase 

FCMRRSRYSGCLMEI DMCOR I L I LCTC ITCKSVARFLYOOGHYLICADNSLESLISVDKL 
HDRLLMGASEFPENI DLVI R5PC I KPYHPWVEOAVSLK I PVVTOIOVALICTPEFORYPSF 
CITCSNCKTTTTLFLTHLLNTLC I PA I AMCN ICLPI LDHMOOPCVRWEISSFOLATOEE 
H t PALSCSVFUfFSRNHU)YHRNU>AYFDAKUlIOKCLRODKTFWWEECSLCNSYOrYS 
EEI EEt U)KCOAUCP I YUiOROfrrCAAVALJWEVCWVSPECFLKA tRTTEKPAHRLEYLC 
KKDCVHY INDSKATTVTAVEKALMAVCKDVIV I LCGKOKOCOFPALASVCSOTnCHVIAM 
CECROT I WDAL5EK T PLTLSK0L0EAV3 1 ACT lAQECDrrVLLSPOCASFOOFOSntERCA 

CPn_0902 iO.MS07 1035241 

nlpD-Muramidase (invasin repeat family) 

AVOOkN/V -^lEVNMNnR OMV tTAV/VNA I LLVALFVT.^Kft tOVKDYDEGFPNFASSKVTQA 
W::EEKV I EK rWAEVP.':R P r AKETLAAOFI ECKPV rVTTPP'/I»W::ETrEVmrAVPPO 
fVRETi/KEEOAPVATVWKKCDFLER I APANimVAKLMO [H&LTTTXJLK rOOVtKVfTS 
yDV,':HKKTrvtX'^rANI'FI/YY tVOBTTDnr-WTIALRNH IRLOOLLKKNOLLEYKARRUCmD 

IiM l^it..||V 
ftsW-Cell Division Protein FtsW 

::Kor :i .':MKwrv i : :i i .t / : r :h i MVFr/r::;:A tvLDf'.';Ltr.Tn ikal i ("^VTYt. tu:u .v 
a:; ta.YMMFwnr)Kr.K r : : i vi .t - u ;aai j\l tr*/y r r<:u ; tcutg iahuma ivtjjt.r t oi*: :kivk 
Yi.vM VAi.YKi .Tf •::;:i.YfjK0LKMKi.Kt.rA r w r r- 1 LL tA I CT-ut ;::aa7 ( :;A:;t.i I VF IM 
•n;vfr f .itYWLi.i 'I .u :vi . t A/ •* :ai y^y I'MI yvmyh lnvyuipeu) t k/:iu:i K^f -voak tAA£'^x\ 
f.\.tr,rt :i\ -^v :i oKLTYLr-F^vni// ( aa ; yakki-'^^fw WLVLI iXYHirr/v? ;va i a i KA;i 
CPn_0904 10 3^320 n3739'> 

murG-Peptidoglycan Transferase 

tBKViXLAVCCSCCH IVPALSVKEAFGRBCI DVLUjGKGLKNHPSLOOCISYIIE: 
PriCLPTVU^P I K IHCRrwUTCCYUCARKEIJC : rDPDLVrCFaSYHSLPVLLACI^HK : P 
LFtJtEONr;/rrKV?rr.rrRYAPnT':;VNF.':PVTKHFRCPAEE'/FLPKR.';F.';LCSPKMKRCT 

•:i!T't:;".v * I 7",At.vK:.vr;r---: •:?^vvwi; :v :; K.:LVVKV.jr;v:ir'';rv 
OVLEKrraiLEKELTEKIXVEmFALDSHNREKORNSIJOVVSOQRSTKTFHAFICECI. 

CPn_0905 1037400 1039835 

murC/ddlA-Muramate-Ala Ligase & D-Ala-D-Ala Ligase 

VHYMKGTP0YHFICrCGrCMSAIJWILU3RGYEVSCSDLYESVTIESUCAW;ARCFSCM 

3SHVPH0AVVVYSSS lAPDNVFaTA lORSSRLLHRAELLSOLMBCYESILVSGSHCXTC 

TSSLIRAIFOEAOXDPSYAICGIJWercLNGYSGSSKIFVAEADESDGSLXHYTPRAWrT 

NI ra^UJhTf AGNU3NLVOVI0DFSRK\a'DLtnCVrifNC0CP I LXGNVQG ISVCYSPBCQL 

HIVSYNQKAWSHFSFTFLCOEYODIEUJlJCOHNAAhttAAACGVALTFGIDINIIRK^ 

KKFSGVHRRLERKNISESFU^EnYAHHPVEVAHTLRSVRDAVGUlRVIAIFOPHRFSRL 

EECLQTFPKAFOEADEN^LTOVYSAGESPRESllLSDLAEOIRKSSYVHCOfVPHGDrVD 

-aRNYIRIHIMrVSLCAGNIVT IGEJUJCDFNPKKLS ICLVCGGKSCEHOrSlXSAOHVS 

YISPEFYUVSYFIlrmOGU^RTCKDFPHLIEETOGDSPLSSEIASAIJVKVIXrLFPVI^ 

FCEOTriOGFFEILCKPYACPSI^LJVATAMDKU-TKRIASAVCrWWQPLrnr 

PELCIQNLrErFSFPHrVKTAHLGSSIGrFL\nMDKjE)Ei;:EKISEAFLYt7rD^^ 

SREIEVSCIGHSSSWCMAGPNERCCASGFIDYOEKYGFTOIIXrAKISFDLOLSOESLDC 

VRELAERVYRAM0CKC3ARIDFFlI)EBCNYWLSEVNPIPCKrAASPFL0AFVMACWTQEQ 

IVDHFIIDALHKFDKQOTIEOAFTICEODLVKR 

CPn_0906 1040514 1039915 

CT763 hypothetical protein 

KfeCSEVI^\^SOI^REASAFRnDIDFFILNrYPFFRNFKNIELCFFLS:SOn^LDFM£ 
EFVAYrVKNLVTNPEAVEIRSIEOEDNESIKLEIRVAAEDICKIIGRRGWriKALRTILR 
RVCSRIJCKKVOIDLVQPElKnWIAIX3DyrCDNDSSNSTEI7rFGESOTCCSGHCHYOm^ 
NQEEQCEGNKHHSCECSNHH 

CPn_0907 1040816 1040445 

cutA-Periplasmic Divalent Cation Tolerance Protein CutA (C- 
Type Cytochrome Biogenesis Protein) 

FAFSKFLIIKSSKTAVLILTSFPSEESARSlARHLrTERIASCVHVrPKGTSTlfLWBCKL 
CESEDiHIOIKSIOIRFSEICLAIOEFSCYEVPE^/LLFPIENGDPRYLNWLTILSYPEKP 
PLSD 

CPn_0908 1041607 1040780 

CT764 hypothetical protein 

IIJVILFMII XXNNELKIRRFFKTUTPCPOYSLCYASILIVl^SLVCVPTFG^J^ 
l^KFNPSPIRNLFLVSSTl^KWPTAIAEHLRLSADAPTYIiiEFSlKEAESSLHALCIFS 
SLVIEKSPDNKGITIFYTLQTPIAYVC^mS^frLCNL^3SCFUWPYFPSLNLTO 
. OU<M0KLPKEKKIJTKILLKEUWESPKIIDLSI^DAYPGEirVTI^SGSLL^ 
RAXiJLYKHMKKSPVIESEKQYVYDLRFPNFLLLKAL 

CPn_0909 1041592 1041966 

rsbV-Sigma Factor Regulator 

IISLirTRFLLnUI«NLSAICEYGDIIVIYtXX3SU>AVSVPSV0EYLE0FI0K^^ 

^RTOVSYISSMIR^XSNFKLV0SU3GK^CLCCVKESVTE^^ 

CLSKL 

CPn_0910 1041970 1043004 

miaA-tRNA Pyrophosphate Transferase 

FLYMLPFEFEFtriTSSPECD^riDPOKLFVKLFKRTIVLI^GPTGSGKTIW-SLAIAW^ 

GEIVS^roSM0VY0GKDIGTAKVSLKAR0EIPHHLIDIRHV0EP^^IVVDFYYEA10ACQNI 

L3RNKVPIL\roCSGFVTHAFt^GPPKGPAADP0IRE0IXAIAEEHGVSALYEDLLLKDPE 

YAQTITKliOKNKIIRCLEI lOLTCKKVSDHEWDrVPKASREYCCRAWFLSPETEFLKNNI 

QMRCEAMLOEGI^EEVRGI^NOCrRENPSAFKAIGYREWrEFLD^ttEKLEEYEETKRK^^ 

SNSWHYTKKQKTWFKRYSIFRELPTLGLSSDAIAOKIAKDYLLYS 

CPn_0911 1044079 1042985 

Fe-S cluster oxidoreductase 

SLLLAI FNVNYFWNLCKRI SFEBCLELFVSSP I ERLOERADAIRKERYPSNEVTYVLDAN 
PhrrnJICKI DCTFCAFYRKPKSPDAYU^FDEVRSLWRYVSSCVKTVLLOCWWPGLG I 
OYLEELVRITVOEFPS IHPHFFSAVEI EHACRVSGI 3 lEQGLORLWDACORT I PCGCAEI 
t^ERVRKIISPKKMCPGGWINt^IOJVHLMCFRrrATMMFCHVENPEDILIHUTrLJl^ 
SCPCr/SFI PWSYKPCNTALRRNVPOOAS lETYYRILALGRIFLDNFDKVAASWFGEGKS 
LJ3AKALHYGAC0FCGVILDESVHKATGWS rOSSEEEICN I IRSEGF IPVERNTFYOH ISC 
TVSSL 

CPn_0912 1044120 1045760 

CT768 hypothetical protein 

WIMDNSDNSFHTLETECXJSFLNDELAVEEVASTESTEISDATLCFAEKKVAFILNKMRE 
ALTCSS0C3DUlLrWDLRK0CLPLFNEIEDTAKRADHWRCYIELTKEGRHLKCL0D£ECS 
FWCQIDLAITCLEKOILKFOECTEDKIFKDREDNFLESOALDKHOAFYKOHHTSLLWLS 
.SFSSKIIDLRKELINVCHRMRLKSKFrORLSNU:NOVFPKRKELIEKVSOTFAEDVDAFV 
AKYFIGSDKETLKKTVFFLRKEIKNLOHAAKRLP/SSKVFAETRLKLSKCWDOUCCMEKE 
f R0E0CRLRWSAENSKEVR0MLAEVSSLLIECNDL2KVRKDLEC I SKK I RALDLTHDDV 
I nUCKEM(>JLFDOLR£K0DAAEHSyO£0LA^K0VKKEAARStJ\En tTTFSKTCSECMlT 
rfESREEWTTLKELLCKMSFLPPPEK ISLONOLNLALOTI VNFFEEOLLSS PDSREKLVNM 
RGVLK0RRERR0EUCDKLE0DKKLLCSSGLOF0RAMOV3ALVEEDKRALEELDASILELK 
OOtOOLL 

CPn_0913 101570'! 104S*J4C' 

No robust homolog present in Genebank/EMBL as of 11/7/98 

U (. M-KYPR ( FJ\TD:;At AMRRNC tYAFOUXTTLLKaC'^rf.'^FYCYf XUXILFriYKTLPPt: I 

YKKFMFKFFF'MFlirSnR 

'*lii_0'»M lll.JS'iO'i 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VFFVnjLF\';FYY::rVTr<I.lj::;vrcDDLYEVAtWFlvrr:.T':::DFYAfVLEKLEEAFADriv:o 

vttrr:::;:rnFrvinMAWti:i:;:wA:u:YRC/j::At7jTiYKKCLT^:oKKAv;i l;:y ikk ino 
AH:;iiTr::nMn.rji.rKLMu;EEKTwnrx\:(<OKKR\/KY'atfNtv 

• >li_'»'M', I li t (mJ 01 ttH».Hl" 
ytwB-toi.iO iiut>"rt.Mnt. c^iiolcvj 

FMLKKSSTL-Wt [E'^PKJ\CFMDJFrF:LLKVAAXA:DDKKC;NNL;VLC\.*TlT:3EFTeYr/ 

KVBCSvNVHVKAtiwrrr/Eaj^KOKvapwiyEC nroff^Kv^oY;;?? ^wwivsEr^KY 

RLEELWKOCFrVTUKtiA:; " • " 

CPn_0916 I046fll3 1048084 

fabF-Acyl Carrier Protein Synthase 

LU*7J^^VYMSKKRV•^vTCFr^^^<;c:-3NEVTTFY0^^-UACVSCVRPTTSFPCEDYATR^AC 

n --r"-"'. ' - - •-■•^v•v••"'A: ^v.-: >:-'v*;=:.; /rvrir-K.; 

- 'H'-r-; • ■ ■•■* ■ = ' - :-'r:**-\T.\: ■ >y- :-■ " >^ :; ':v — - 'AT^rr*: 

DAAYOHLVboRADMZ :*G;*T;:AA\*"NR : JL£JFiANKAUjERNDA(-uvMaRH/2R0RDCFV 
LGBCAC r LVLETLE3ALRRDAP I FAEMLCSYVTCDAFH ITAPRDDCBCITACVLGAtNSA 
C tPKERVNYVNAHGTSTPLCCOS ENnj^VKKAFCSHVRNLRMNSTKSLIGKCLCAAOCVEA 
WArOArLTCKUHPTirn-DNPIAEIEOFOWANKAODWDIOVAMSNSFCFCOTlS^ 
RYVP 

CPn_0917 1048054 1048539 

hydrolase/phosphatase homolog 

FNOIlLEVCTLVJWKTKYEYSFCVIPIKFPGTPDKMTUCACriCHTRGKHWCrPKGHOT 

KEGPOEAAERELVETKXSVVNFFPKVLIEOYSFNNEEOVFVRXEVTmAEV^ 

PMEICDSOWLSWEGLRLLSFPELRDLTVEADKFINNYLFSS 

CPn_0918 1049232 1048579 

ppa-Inorganic Pyrophosphatase 

EU/lSKKPLYVAHPWHSPTLTCDNYESUrCYrEITPYDSVKFEUJKATCLlKVDRPOKrS 
NFCPCLYCtXPCTYCGTASGNYSGECTRRECIOGDKDPLDVCVLTEKNIHHCNItiOA^ 
IGCLRI IDSGEADDKI lAVLEDDLVFAErEDISOCPGTVLDMIQHYFLTYKATPNHLriCC 
SPAKIEIVG lYGKKEAOKVIOLAHEDYLSYIGDTAEVN 

CPn_0919 1049375 1050430 

ldh-Leucine Dehydrogenase 

FWCYSLNFKEIKIDDYERVIEVTCSKVRLHAX lArHOTAVGPALGGVRASLYSSFEnACT 
DALRLARGKrYKAI ISWTGTOGGKSVI ILPODAPSLTEDMLRAFaOAVKALKTIYICAED 
LGVSmOIS tVAETTPYVCGI ADVSGDPS lYTAHGCFLC IKETAKYLMQSSSLRCKKIAl 
QGIC»VCRRUjOSIJFECAELVVADVLERAVODAARl.YCATrVPTEEIHALECOirSPCA 
RGNVIRmJLADLNCKAIVCVANNOLEDSSAGMMLHERC ILYGPOYLVNAOCUWAAAI 
BGRWAPKEVIXKVEELPIVLSKLYNQSKTTGKDLVALSDSrraXUArrS 

CPn_0920 1051423 1050431 

cysQ-Sulfite Synthesis/biphosphate phosphatase 

IIXDiShWSEI^NYONIVESVVTEITTOU-NYRSEHRLVPFWEKSDGSFITAAOlfGSQYY 

tJCQOlAKAFPNIPFrGEETLYPDODNnClPErLKrnUiTSSVSRDOLISTLVPPPSP^^ 

LFW.VDPrDGTAGFIRHRAFAVAISLIYEYRPILSVMACPAYNOTFKLYSAAKCHCLSIV 

HSQNLCRRFVYADRKQTKOFCEASLAALtWHHATWa^LGLPNTPSPRRVESCWfALV 

A£GAVDFFIRYPFIDSPARAWDHVK;AFLV£EAC»RVTZ}ALC;APLEYRK£SLVU^^ 

LA5GDQCTH£rrLAAL0NQLN\A^OKLIAL 

CPn_0921 1051526 1052293 

snGlycerol-3-P Acyltransferase 

GEI>(LIKLWW^TYBCKYITLVCAIiKIJtYRMQVBCWI7rLNINPKQCCl^ 

ILEYLFWSRFHVKPMAVEVLFHSRWOWFLNSVRSIPIPOLVPCKESKRSlXWflWCra 

ASRAtinu;£SIIXYP5GRLSRTCK££IV^OYSAYVU^RV^IECNVVLV^ 

YKQMSTPKLGPAFKEAFRAUJlRCirFMPKRFVXITLCOVDKLFIJCQFmODU^^ 

WFDQCDDNLPIEVPYA 

CPn_0922 1052266 1053927 

aat-Acylglycerophosphoethanolamine Acyltransferase

OFAKRSSIillTRKUWHHOQRNRGHNNKNIJlUlPGSTUXArLILCSEHEECZACFO^ 

GSLSYRELRNAI lAVAIKVSKFSEDRVGVMMPAS IGAFIAYFGILLACKTPVWWWSOCL 

REUU«:TKTVEVRRVLTSOOriKHLTEMX;FVEYPFDll<VMECVR^^ 

KCSVPWUAIFGVSGVESOOTAVIUTSCrrEKLPKAVPLTHKNUtEWJEACUCFFDPr^ 

DWtJU^LPPFHAYGFNSCGLrPLLMGVHVVFASNPLNPKKL\^IOD)CK\^^ 

DYIUCTAKKQMSCIJKUU^WIGGCAUC£3TLYErrKKL0PQIALYOCYGAT 

TKESPRKSECVGMPIEGMDVLI ISKETHIPVSSGEOGLIWRCNSVrSCYUCNHEHOSFV 

SLOGDOWn.TCDLGHIGPSGDLFLEGRLSRFVKIGGE>fVSLEALESILHEHFTENONEDA 

GSLVVCCIPGDK\mJXtrrTIATTIHEVNDILKSAETSSIVKISYVHOVESIPILCICKP 

OYVSLNALAVSLFG 

CPn_0923 1053966 1055093 

bioF_1-Oxononanoate Synthase_1

VCKESFLTTSDVrDFVTNDFLCFARSPTIYCEVSKRFOIHCOOFPHEKLGIRGSRLMVGP 
SSVTDDLESKIASYHGAPNAFIVNSCYMAtn^irHHVSRSTDVLLWDEEVHMS\WHSLSA 
ISGOHHTFHHNNLEHLESLLOCYnrSSKGRIFIFVSSVYSFRCTtAPLEOriACSKlCYHA 
HLIVDEAHAMGrFGDDGKGLCHA^JGYENFYA^^-V^YCKALGTMGASlXTSSEV!CYDU<^N 
SPPLRYSTSLSPHTLISIGTAYDFLASECEIARKOVFKLKEHFHECFDSHAPCCVOPIFL 
PHTCLEEAI SVLETTG IHVCWAFAKHPFLRVNLHAYNTVDEVNLLAOVMKPYLEKSSHR 
^/HINHEFHLWRELCCH 

CPn_0924 1057301 1055028

priA-Primosomal Protein N'

KRFTAKTKSMGY t ES5TFRLYAEV tVCSN INKVU3YCVPENLEHITKCTAVTI3UIOCKK 
•/WIYO IKTTTOCKK I LP r LCLSDSE I VLPOOUXILLFWI SOYYFAPLGKTLKLFLPAIS 
StWlOPKQH YRWLKCSKAKTKEI LAKLEVLH PSOGAVLK I LLQHASPPGLSSLMETAKV 
30SP tHSLEKLC I LO I VDAAOLELOEDLLTFFPPAPKDLHPEQOSAIDKIFSSUCrSQFH 
THIXPG ITGSCKTEI Vt-RATSEAI^CXIKST I LLVPE lALr/OTVSLFKARFCKDVCVLHH 
KLGOSDKSRTWROA^ECSLRILICPRSALFCPHKNLGLirVDEEHDPAYKOTESPPCYHA 
P0VAVMPGKLAHATVVU;r:ATPr:LESYTNALSCKYVLSRLS3RAAAAHPAKrSLrNMNLE 
F.EK.TKTKILFCOPVLKK CAERLEVCEO'/L IFFNRRCYHTrn/SCTVCKHTLKCPHCOMVLT 
FHKYAfr/UXIit/TN:; J PKDLry.':CPKCLGTMTtJOYRGGCTEK I EKI LCO rrrO tRTIBID 
."OTTKFraMETLLKOFA ^;K*A0VL I CTCM [ AKr^FSA^/TLAV I UJGDrXILY I PDFRAS 
EOVFOL ITf.iVAf ;R:**:R::iir.|X :E ( L 10::Fr.rnMFT r HSAMPODYrJAPVrrOErTCRELCEYP 
f'F IRL I r-O £ FMCJK(TKVrWKKAMI<VMM tLKa>t-£:rrNPLME'-/TPCCHFK IKC3TFRY0FLI 
/HAW r PVNKKUIHALMU\KL;;i'KVKFM tUVOltrrrFF 

* rr/Y' t riy{»c ir ht rr t . 1 1 I M . •> •> t ti 

KHWLF«aif;ONFiiDTi J :oM r/h Y::EF:r.Y(''ri A::i.LNVTLff (TA t::A.'r/::::f PEKAVEVPN 

AEPUt'ITPPPt^m^VKKTKI-::t**KiV('rJirOL:;0NAlLKEKYPALK£/:::UAI'KfFX:Sl 
r/YKfllf IEEVI.FFNKtrf\K I i:V'.^jl.i'\ -fK: .Tt. r I lAKTN I ryUMPNFFLAt JVrUf/t RYK tP 

•rpDYHO.'XToro: i ki .i-uy::. .kyek (//jjixmiima i uip.LPFAYTir:;.: 
Thioredoxin Disulfide Isomerase

CHHTTOTYITTPTFKDSOMKfVLOGCAFVrXLXXTLPCCAARRRASGENLOOT 
WESYAEALEHSKOOHKP tCLFFTC3D»OWC IKMOtXJI U^SS£^KHFAC7a^LKKVE^^ 
POKhmOPEECP-OKNOELIOVOYK^nCFPELVFtDAECKOt>Jt*CFEPGOGAAWSKVKSAL 
KLR 

' iii.: . ; - • t .,-if. ■'■ »>■ . 

..■ :^;:.:r:•:'::;^•^--:v^:-:..r:.::.iHF::.:Kr:-:A.:•^^i^^^^ rrrvrrirvK 

F 1 13 1 1 LrLPLALLWVLKiCTCOrF r LPSS I ISOSMSKTAVAI RRKTFLSH IKQLXSLKEI 
SAAORWIOYDDLWDSLA IK I PHALPHRWI LYSOCNSGLMENLFORCDSSLHOLAXATG 
SNLLVTNYPG IMSSKGEykKRENLVKSYOACVRYUUJEETCPKAWI lAFCVSLGTSVOAA 
ALDREVTtXJSDGTSWI WKDRCPRSLACVAW ICKP I ASAI IKLVGWNI DSVKPSERLRC 
PS r FIYNSNHDOELISreLFERENCVATPrLELPE\nCrSCrrKIPI PERDtXKLNPLSPNV 

VDRIAAVISNYLOSENRKSOQPO 

CPn_0928 10S1035 10S9B84

CHLPS 43 kDa protein homolog_3

RRKDFAFTLLNLSNRSDIl^IFSNPHPVSYFSSTHAKOLSDrSKKHPILTKrVTIIVKI 

FKLLIGLIIPPLCIYWtrOLVCSLAI^PRSSMLYSVLrrCrrKYRLBOEIQtrmnC^ 

SFKDPAVSESKRITIOODHLTIDTLAIHFSTARPKRWLLISLCSCDFLEDMIGLKDSLFL 

SWKELAKLLGAN I LrmPGVKSSTCKUaLEIJLATAHNLCAiCnflDKICXJPGANEII^ 

YSUX^n/QSAALOKNPFTNSETSWVAVKDRAPHSLPAAANSFFGPICKLIAVLARWKKDA 

EJCNSRELPCPEILVYSADRFRPSE^rtlDDTAIiPErriJWAIKKrPFARSKKriGEVNLL^ 

5SPLKHPTI0KLAEAILE5L5RKN 

CPn_0929 1062301 1061186

CHLPS 43 kDa protein homolog_4

EKFMAP IHCSTlAFVnJILHSHPSPQATYFSSTRAOKLHEFKDRHPVLTRIASVI IKIFKV 

LIGLIILPLGIYWLCOTLCTOSILPSKNliKlFKKOPWrmJCrWLHAW 

SMRRWIWJDNVLIDTtXICLSOAPTNRWaiStiCSKStXtrACKEIFOSW 

ANILVYWPCVMSSTGSSSUOSIJ^AKNICTRyLKDKBOGPGAKEimCTfS 

AUUX3KIVA^flJP^VIAVKDRCPLFISPEGFHSCRRICKLV^R^^C>CTKA 

LEIFLyPTDSUU^ST^mO^^CIXAPELTlAHAIKNSPYVONKEFIEVR^ 

VALATPILKXLS 

CPn_0930 1062851 1063330 

No robust homolog present in Genebank/EMBL as of 11/7/98
NKKSElJ^STCLOMVPHTOVHHAUmWVILTIAACt^LIAGIVL\^^ 
VIGGMILILFSSIALIYLYKKTREVTWIAIXPLPEMISKTOSriDrmRDYASLEKKAT 
FAVnfTHYYreSW^REIPRFTlLCSYLAIJWOMDROALF 

CPn_0931 1064078 1065718 

lysS-Lysyl tRNA Synthetase

IDFRVLCWXSDirrmiXERKTARAEYIJJHEDFLYRSHKLOEl^HlJtWVLYPYEFTGV^ 

C£DXKKTFA5QELSNSEAAMSRSTPRVRFACnU.VLFRAMGXNAFS0tLOKNQTXQVH^ 

ErrSVHCLSEDAEmiKFIEKKlI)U3DILGirxrfLFFTHSCELTVLVCTVTL^^ 

LPDKHAGI^DKEVRYRKRWLDLISSREVSITrFVKRSYIIKLIRNYHnAHGFtXVErPIM 

NIYOGAEAKPmniEAUlSEMFLRISIXIAIJaCILVCGAPRrYELGKVrRNSCrDR^ 

PEFTMIEAYAAyHDVKEW\m/ENLVEHLVBAVNHDNTSLVYSYWKHGPOEVDrK^^ 

HTKKES lATYAG rCn/DVHSDO)aKEIIJCKiarrrPETArATASRCK[.rAAI^ELVSI^ 

APHHITDHPVETTPtrmJlSGDTAFVERFESFCIjCmrNAYSEtJroPIROREL^^ 

TIa^EII^DSECHPIDEE^LEALCOTlPPAGGFGIGVDRLVMILTNAASIRDVLy^PV^G^R 

FDAEKTN 

CPn_0932 1067160 1065721 

cysS-Cysteinyl tRNA Synthetase

VKSDTVMAFSHI ECLYFYTn'ASOKKEUTPNHTPVnU.YTCCPTVYDY 
lUCRTLVFFGYSVTHVMNITDVEDmAGASKKNIPLOEYTOPVTEAFFCTUTT^ 
DFYPHATHY I POMrOAITlaXEOCIAYIGQOASVYFSLNRFP^m^a^HU)I^UlCCSR 
ISADEYDKENPSDFVLWCAYNPERK^VIYWESPFtyCGRPGWKIJCSIKAMEUXaJSLDIH 
ACGVCNIFPKHENEIAOSEAI^CKFFARYWUfSEHLLICGKXMSKSUSiFLTLRDI^^ 
FTGOEVRYMLLOSHYRTOLNTrEEALLACRHAIJUajCDFVSRIXS^ 
SSOFIEAFSRALAJTOtWSTCFAStJ'DFVHElNTLrDOGHFSKADSLYIIJrrLiaaaJTVL 
GVLPLrrS\A: I PETVMQLVAEREEARiCTKNWAMADTUlDEILAAGrLVEDSKSGPKW 

CPn_0933 1067532 1068578

predicted disulfide bond isomerase

PVIIXONI KRCSLKOUCVlJ^TLa^LSLPTLEAA£inU)SDS IVWHLDYOEALOKSKEAEL 
PlXVIFSCSDW^«;pCMKIRKE^^ESPEFIKRV0GK^Vr^/EVEYlJCHRPOVENIR0ONIA^ 
KSKPKrNEl.PCMILLSHEERErYRICSFCNETGSNLCDSLCHIVESDStXRRAFPMKrSL 
3l^EL0RYYRIAEELSHKEFLKHAIXLCVRSDDYFFL3EXFRLLVEVCKMDSEECORIKK 
ftLL^«DPKNEKOTHFTVALI EFOELWCRSRACVRODA30VIAPLESY XSOFCQOOKDNLW 
R'/E*«lAOrk-U)SDQWHHALOHAEVAFEAAPNEVRSHISRSLEYIRHOS 

CPn_0934 1066948 1068526 

rnpA-Ribonuclease P Protein Component

YFVHPLTLPKOSRVLKRKOFLY TTRSCFCCRGSOATrr/VPSRHPCTCRHG ITVSKKPCK 
AHERNSFKR WR EVFRHVRHOLPNCO I WFPKCHKORF'VFSKLLODF INC I PECLHRUJK 
TKATTOCECTPKSEKCVTAPR 

rPn_0935 I0c?l00 1068957 

rl34-L34 Ribosomal Protein

E0TVKP.TY0PSKRKRRN5VCFRTRMATRNCRKLLNRPPRHGRHSLVDL 

'.'Hn.Ovj'* U)n'?310 1069470 

tWt-Li't Ritouomjl Protein 

Y: JlKVr;:;: a'KA&P:;KoDKLVRRKCRLYV:NKKDPNP KOROACPARKK 
:i*it_l»'H/ lOi^'i.TH? lOti*»7MS 

t.:l1-:;M Ki^xnuYtkil ProcHin 

7Kr<KAKK::::VAUrVUCRRRLVG\NFKKR::DLRKtWnL-Vr:EEEKENARI:;LNKMKRDTSP 

Ti'lJiNi" 1 .1 ;p ;rM'i<',;viJ(Ki*A i :;r tt:FR(.>MAUMi^E t ;v i ka:.-w 

t'r/im r»yfH<t ic i i. Ml nor. .-in -|l>."iilnr ".'u r>*-n> i<l<* i"*^!^ iP^^:'''"^ct 
■./rrNK*/rri.in'Mt.i'T::it.c.iwru;£:u;AV[AOKKKr'rr/rfWF^Ai :akf»'fi'':lwu-ll 



cpn.09J'» n* • i07u-:is 

CT790 hypothetical protein

HINRVfftRl^tTW'pjrVLTrrSBEIELtCpCKMDCOlff^JJLOVKCT 
VIOVt X UlCLAKIICVSLLCZINLinftLFCRD t ERHKCrYVEODSKJf^^ 

VS rPEICTEEIOCCrVSEISrrrCLHVAAVWI ikgltopkdr ideeieeevsvcdlpspe 
DFLLENSBC 

rPn.fto4n io7im? I07i204 

■ .vr * t-U ; I MN- ■ : • I • -i::-::::* 

•rvrr-r ::.VMP:rr=- -:■7>"-•:y^■- v: "'x :: .•::-:.\.--!tEK-r;;? 

ER IPFLMKKTAS rCTXWSHETEAULLENNLIKCHHPKYNVXUCOOKTFFCLAlSLSKSW 

PKVEAI RTKMTSSOROLI rCPYVS AEACHTLLEVI SOWFPLRTCSIWEFALRKRPCILY 

DMWCLAPCVCYCTPEEYgCTLOKAILFUCCKIEEVVKDLEKVIOKASD^^ 

RTt^IKOAKAK(»VEKFHrONinALGLYRHKORTILTLLTVRSCnCL^^ 

EIXJDU^SFrUIYYVSOPYIPKEILTPLPLEFPTLSYVUiAESPPRlJlSPKTCYCKELLD 

tAYRNAKAYAArPLPSSTLPYODPONILRMSOYPYRIECYWAHMOCyVHATCVYlVFDW 

GroPKOYRTFSIOSEKT0NDLA^XEEVLLRRFHSLTTALPmI^AaXXa^^^^ 

TLNLTCIQVVTIAKEKSNHSRCLNICEKirCETFPEX:FSLPFTSNTXOFrOIUU)E^^ 

ISnflUOOlGIMLFEOEKIPGIGCVKRKRLtAKFKSWKQVMt^SQEEL^ 

LUUtOXDFNKSD 

CPn_0941 1075504 1073018 

mutS-DNA Mismatch Repair

VKfEJ tKWP WEC>fflQCK£KACDSVLLnWCn?FYEAFYDDAVU^QHL£LT^ 

SGIPVSTVDTYVDIU.ICKCFKVAVAE0FGEPAKElCESKKIGPKAROI0RrVTPGTLLSCT 

L tyQ FCTWWn/ArNRICSLFGFACIJDLSTGSFFIEECamCELVDEICRLAPSE^^ 

FYNWSTAIVMOLOOHUCLTLSTYAIWAFEHKFASOKLTTHFOVASLaSFCU^^ 

ACCai^IODKLtiPTKHIAIPQTRCKOOKIXIOTASQVN^^ 

DirrSTPMOGWIJlOILISPFYNPKEILVRQnAVEFriJ«3VrLW^ 

TKVTICLACProiGTtJttJSrSAGAOIYEOLASATLPEfTrDKCSLSTK^ 

fCDl^UiVSDCWIFVDEFHKDLKRlJUlNOEHSOEWIWEYOERIRKETCIK^ 

GYyiTCSErAPOtJK0riRTOSWJlAERFTriElOOFODI»OJISE3aOT*ETOrrra 

CSHILOLRTEILALSOSLWJL0YI IStADIJWAOGYCRPHVDMSDTLCI YRGCKIVArrL 

VUTOCFIPSITrEHRCSGTRHXLLTGPmACKSTYIROXAIXVXW 

IDIttrrRICy«n»aJKCWSTFHVE«AETANILHNATDRSLVIU)EVC^^ 

AVVE«IJTDKKKAKnXATHYKELTTLEDHCPHVENFH^ 

OKSPOIHVAWJ^FPLCWSRAOQIUWI^PESXTRPAQDK^ 

CPn_0942 1075955 1077754

dnaG/prlH-DNA Primase 

fCSXTKUCrAKYTEESLOOJWSXDIVDVI^EHXHLKRSGATYl^ 
PACaHYHCFGCCAHGDAXCFLM0HLSYSrrEAILVLSKKFC3VDLVIX)PKDSCYTPPQ^ 
EEUWXNSEAETFFRYCtYHI^EARHALOYLYKIUSPSPtTrXDRFHLGnrGPBOSLniO^ 
ERXISQEOLKTACFFSNIWFXJ'AIUIX XFPVHXULCKTXCFSARXFLElISOOCICYWrFEr 
PXFiaCSRXIJaJIF5RllRXAXEiaC\aLVEC0ADCLQMXO5GFNCWAA^ 

XAU^SQDYLTFLXSEICMSSYPlCFGPREXAtXVCEAXROXiaMCSPILVYEHUCO^ 
IO(VPE2»(VLSt>NPQVTAepQNXPXK0K\^KXKPHXVKETDILRQ{I^^ 
FVPVPEDFWPECRIOJ'AFHXSYYaCYRKNVPrDEACaVLSDSOXLOLLTKRRXlffEM^ 
TXFVQSIAKKADRRWReQCRPLSUC»'IQDKICIXXL£DTV^ 

CPn_0943 1077972 1078238

CT794.1 hypothetical protein 

PFMK5FKFIJLPFl^XUXCNLX^SPRSRAI5VT£SXGMSAVKTLVUEXAK£F^^ 
GVGASSXLRCMQrrOWXXESLLAQHEVH 

CPn_0944 1078503 1078997

No robust homolog present in Genebank/EMBL as of 11/7/98
IKtMKHRyFXPLtJ^Xr5P5t.VRAEU}PSEKRXa;WFTQL^C:AECS0l.FX:iCFM 
lESCaPCILVrFSERPTPEFADLTtCSFSLSTPIAKGFNVVVLCPCLISPLDFrHICMDPV 
XLYMGSFLQIFPEVCAVSGPRLCYXLXDEOOCAOCQAVLPLETKN 

CPn_0945 1079001 1079660

CT795 hypothetical protein

5 X FlOff XLPSYFCHNF-DOLRRKYKRt ALSU^LLMXFP I PGEESRPGSEOCNSNTOCXVC 
SODT0ra»YHSYE0CL0ASRXECKPLVl\AnXNSGDDOOACTICI^roEEVLSV^^ 
FSEtJWFWLVPSGVNPLXYPP lEDP ILAEIVKFKELFKDESFPTCLSI IWCVTPEGPG 
. DIIEVSPVSLTVEEEETLPSEOTrEVESTSELOSEDPAXA 

CPn_0946 1082816 1079745

glyQ-Glycyl tRNA Synthetase

GECOKKKCYTLESFVSEHPLTLOSMIATILRFWSEOGCVIHOGYOLEVCACTFNPATFLR 
ALGPEPYKAAYVEPSRRPODCRYCVHPNRLOMYHOLOVIUCPVPENFLSLYTESLRAIGL 
DLROHOIRFIHDDWDiPTIGAWCLCWE^WJGMEXTOLTYFOAIGSKPLDTISCEXTYCI 
ERIAWLOKKISIYOVOWNITrLTYCOITOASEKAWSEYNFDYAhfrEMWFKHFEDFA^^ 
RTLKNCLSVPAYDrVIKASHAFNILDARCTISVTERTRYIARIROLTRLVADSYVEWRAS 
UrrPU^LSSTSEPKCrSESV\^rSSTEDLLLE ICSEELPATFVP IG lOOLESlJWTVL 
TDHNIWBCLEVU;SPRRLALL\n<NVAPEVVOKAFEKKGPHLTSLFSP[X;DVSPOCQOFF 
ASOC\ro I SHYOOLSRHASUVIRTVNCSEYLFLLHPEXRLRTADIUtOELPtL rORMKFPK 
KMVWDN SGVEYAR P I RWtVALYCEH Z LP ZTLGT 1 1 ASRNSFGHRQLDPRKIS XSSPODYV 
CTLROACVWSOKERRMX I eC3GLRAHS50TISAI PLPRL XEEATFL5EHPFVSCC0FSEQ 
FCAtJ»K£rXXAEMVNHOKYFPTHETSSCAXSNFFIVVCDNSPNDTXIECNEXALTPRLTt) 
CEFlJTCODUrrPLTPFXEKLKSVTYFEALCSLYDKVERLKAHORVFSTFSSLAASEOlJJX 
AXOYCKAOLVSAVVNEFPELOCIMGEr/LKHANLPTASAVAVGEHLRHITMCQKLSTICT 
U^LI^RlXNtXACFXLGLKrrSSKDP*^AIJUlOSLEVLTtV.';ASRLPXOLA:>LLXMUJU}H 
FP<rr t EEKVWDKSKTXHEI LEF XWGRLKTFMGSLEFRKDEI AAVt, I D5ATKNP (EI LZ3TA 
EALOLLKEEHTEKLAVITTTHNRLKKl LSSXJCLSfTTrSSPX EVtjCDRESNFKOVLOAFPCF 
PKErrAHAFLEYFLl'LADU^XOOFUfTVII t ANDDGAIRMLR ISLLLTAMOKFSLCKWE 
CVAV 

':PtuO'i47 IOHMM l')H40'j*t 

pqsA 'UyciTol i I* Phcmpharyh/ Ir. iMrtiUt^ciso 

nRvnLPhrY iTFr;Rt.F irr r m ilylk' ;KWfir: rrr-wr .i'T/r.LAia-\ i r;Ei.TnA iixwa 
RKFrx^'/TDuiKtLDi-K/M):.: I YH i:: rYLTnvr*i*vi«.rLLL*/F I Fi-\nt>:;v[;rrumvAF 
i^TfiWAARA: a :k I.KA ( I :v: :Ki-r . r t.L'/M 1 1 1 1: : i / ;i .u x»rx :le i ka: vrv:: 1 1 avy:: t a:; 
I E'^FV/MNKNKU';c»HAKTKt '•.:mmn':KU 

•:Hl_tr'4H H»S».4»il I**M41M7 

•/MA ';i Vi:i*M.*ii :;viiri».ii:c 

';E:Wh t WVAVKiTi' 1 VKW u :i i :iiAV/*: :ki:i ^KONtJVhr/U,ruvri. i ::Kf:::;:'^-u:F. 
KiFT^EFtjCKgOA; :a t : :y::vw :i .Tt.T i r Ti .n: v i F:i.F:7rp:;r/:: ^>i^^wr^ k.takaaaaaa 
VLUEAOPAU t'/TIOHDWHVnLLAGLLKNPLNPVM^ V tHNFCrRCYCSTOLLAASOID 
TMLJIIYOLrROrTrSVLMKCALYCriDYrTTVSL; * ^OEIINOYSDYELHDAIUUtNSVF 
: lU: £DE0VWNPKTDPALAV0YDAJLLa£P0Vt.FTKK£OiRAVLYEKLGISSDYrPLI 
t:;R r vrcKCPEFMKEI t LHAMEliSYAF t L tC7r30NEVU^XFKmX}DClASSPNIRLr 
LDFTiDPLAP LTYAAAWIC I PSHREJ^LTCLI AMRYCTVPLVRKTCCIJ^OTVrPC^^ 
TFFITmNFV EFRAML3NAVTTYR0EPDWI/;LI ESCMUIASCLD^^ 



irr^Ol nvponher. ic jl pr-.r#»in 

3L^XVSYt^NPr>KAWry;SifGF?Mipq/9KtKLV JiPfl[J<4.rx:D7 
ECU/I5 PI EVEXEXCRXO:>TIOW I UU£Uici^ 
VCSQTOniPLIR0EU.L£2DCFEECSO0fV:PERKN r LKFLEDRKKHECNSPFEYL 



KKFLiNL£3GALS3TVF3LSYECRI IKALVKDrOYO ITTYDVIHLDFEELVEDRPtKLNI 
P t RC INAVKTICWLCCSUIOVIRAVRWCKPKDIVPFLELOVRSVCLSQTRKLSOIKIP 
ACrETITPLKEVAITVSRR 

CPn_0950 1086470 1087027

pth-Peptidyl tRNA Hydrolase

PSLEI3NMAiarVAXGNPRHCYAmmMCFUJVDRLVE£LOGPPFKPL^KCKAUrrLVC5 

SSGPLVFIKPTTFVNt*SGKAVVUUCJCYFWALSHILVU^DVNRSFCK^^ 

^CUCSrTASU;SNEYV«LRFWCRPLEBCVELSN^^a/aCFSE£EMLaU;S 

EWCSKF 

CPn_0951 1087113 1087457 

rs6-S6 Ribosomal Protein

•EFLMCKXENOLYBGAYVFSVTLSE£ARRXALDKVISCrrNVGGEIHKIHDCX:RKKLAYTI 
RGAREXW/FIYFSVS PGA ITEI>?KEYHUCTUJir>rriJ?ADSVKEV^ 

CPn_0952 1087469 1087723 

rs18-S18 Ribosomal Protein

C£NH^^CPVHNNEKRiUCRF^nCKCPFVSAGWKTIDyK^>VCTLKK 
RFQGVLSQAIKRARHLCXLPFVGED 

CPn_0953 1087727 1088248

rl9-L9 Ribosomal Protein

rKGRRMKCX3IiIXErviX2iCRSGDLITARPGyVRNYLIPKKK^^ 

RLIOAAADKADSERlAOAUCDIVLEFOVRVDPDNNHYCSVTIADirAEAAKXNIFLVRni 

FPHAKYAIKNlXnaCNIPLKrJCEEVTATIiVEVTSIKEYVTVtJU3^ 

CPn_0954 1088259 1088708 

ychB-Predicted Kinase

CRKVCYKDIMOYFSPAKLNI^UCIWSKRfTNFHELTTLYOAIDFCT^ 

^WNEU^PSNLrV^CSLEI^RRETOIH0PVSWHLNKSlPLOSGLKX;SS^lAATALYAU^ 

FOTHrPITTLOLWAREIGSDVPFTFLOEOH 

CPn_0955 1088612 1089175

(Frame-shift with 0954)

RAfWPYSYNNIATLCSRNRKRCSFFFSSGTALGHSRGEHLFSIKKUIHK^^ 
GI PTEKAyOSU.PODYSTCNHNACnfGQroL£iCSVFRIRTDLKMat^ 
LMSCSGATIJVCVLEEI^ODSKVSSOIKSLIKQTCXSrPVSRLYREPHWYSUCOSTy^ 
LECFQPQI 

CPn_0956 1089545 1090909

CT805 hypothetical protein

LWSMILPPYSYSUCIGAAVLFFCSILHTrLTPWLYTt^OSYEHKKLVFPECWKRYAR^ 
SELFRILSRVEIVFFU^JAVPLFFV^rrEGYRISMAYFTlSRNYCFAVrrMVILILLESRP 
XVYFAELVt^SIAKLCKTSPKSWWm/lIAPPU^CLUCETGAMIIGAT^^ 
SRRFAYATMGUJ'SMISIGCLTSYVSSRALFLIFPALKWEHSFrLSHFAWKAIVAILIST 
TrYYTIFRKEFKKFPDI PSDKDPSVEKVPWWI ICVNI I FVCS I ILSRSTPLFMCALLLFY 
U;FOKFTIFY0DPINLSKVCYVGLFYAGLVVrCDLOQ**VLNU40G[^0FGYM^ 
IFIiXULVNYLVKNl^ATOCYHYLWAGCHAAGGLTLVSNIPNIVC^ 
HMCWLFIjCALCPS I ISLCVFWLLKNVPEFLYCFFR 

CPn_0957 1093812 1090963 

ide/ptr-Insulinase family/Protease III

KIVTR>Kn(MFWCLt£PILICTSI^ITSCECX?rKWPNCX:PLOVSTPAAAl^ 

GLPLIJISOPNLPTSGAALLVKTaJNADPEEYPCMAHFTEKCVFt/^NECT 

SEN>CVHNAFTYPNKT\n^SVEHSAFSDAUX3FVHLFINPKniOEDII)REKYAVH0ErA 

AHPLSIX;RRVHRtOOLVAP<XIHPCARFa:crtASTI.TPVTTEKMAEWrKU(YSPE^^ 

YTSAPLSKAKKOFSKirSOIPRSKNYERQEPFLPSCDTSSLXNLYINOAIOPTSNLEIYW 

H I YESSHPI PLCCYKALAEVLRNESKNSLVSLIJCNEOLITOrXVEFFRSSIJnGErYI^ 

ELTEKGDKHYSOVIOSTFOYLRYIOEHCIPNVTLEEISrrNALNYCYSSKSPLFDLLCXQ 

IVSUaJEDLSTYPYHSLWPKYSSEDKAUJJLVSDPEOAMVLSSKNSEHVreEATOLHD 

? I FDMTYYVKALDCVODYGKVOSLKP lALPKPNLFI PKEVTLPCVHLLKKQEFPFAPALS 

YQDDKLTLYHCEDHYYTAPKLSSO r RIRSPOISRSS POFLVATELYCLAVNDQIXREYYP 

ATQACLSFTSALCCOG I DLRVSCnTTVPALLNS ILTSLPNLEISYETFLVYKKOLLELY 

CN[^ECOKKDYLEMLO\^ASRSSHATKPFYYEU)SOEISE1KHOYPLTANGMLLLLOOK 
^SPSICCKVCAEMLFEWWHITFEEUmXWLCYMVG^ 
EEtXAKTSLFUJICVSASPEKFCISOEKFANIRKAYINKI^ 
PFVEFSTPDUCIAIAETLTYEEFUCYCOCFLSNEUrrOTSVYIRGTQKTS 

CPn_0958 1094803 1093793

plsB-Glycerol-3-P Acyltransferase

rnAaLANH0TErOPOLMYYAU;KTHPELH£NMrFVACDR\n'SDPlJUlPFS>CK:DLLCIY 
KRHIATPPELREEKLLHNOKSMOrLKTLLNBCGKFir/APAOCRDRKNABCRLYPSEFSP 
£.1 r EVFRLLAKASNOTTHFYPFALKTYDI LPPPPK I EHAIGEORA r FFAPVFFNFCAELF 
FDAU:.';XE£LIHCDKHAOftTLRAEKVFSIVKNLYEEL r r nr UA^^f 

rtiiji't\'t I ()•>»> j7f; 1 0**4 7 04 

';.ire-/uci.il FlUuDtfnc Urotein 

/y:lr:I^TnKWENEILUlrE:JKEIRYAIILK^KX3U^DLTIERKKVR0LKCNrYRC^ 
I J<N to. .AMN I DEREJ* IF I H mO I L£N::KKFEOMFDMOVDALPEEAGEAPLL33EEAP t E 
my.LL::rV[.V0WKEt-Ii;:;KCARLT£:Nr:>IPORYLVLLPNCPimCV!;RKXEOPi(MHE0L 
KML I k::KKM I'XW :L ti:nTAJTTA:>TEAL tNEAHDLLLTWKT I LEKFYSTEOKTLLYSET 

KLtDKATMUK (WU;Ji\;YLrFDKTEAMirr tDVNrxiRrrTOLESCVEETLVOINLEAAEErA 
MJI /'t J^rPVT-i :( .V a OF [ DMKiTRKNORRVLERLKEllMKYDAARCT I U:MSEP0LVEMTR0R 
NI<K:ii4'yrUTU;ryC:X;NA[tKTPECWrEIERDLKKV[MJKEJir;H^ 
iCrjWrU*ll.MIHLAK0l.»O\Ktj0rNTr;D:;vHUiHY0FFCL[TrjEStDL 



CPn_09»>l 

rl I2-L52 Rtbosnm,!; 



I097l0h 
Prnrem 



CPrv_0962 1097301 I09i3275 

plsX-FA/Phospholipid Synthesis Protein

ILSDF>«E\A}ICI0W<Xa«SPLVVVOVLVDVtJCS0SSTIPFAFTLFASECrRK0I0OT 
SDLPOEICrPKIISA£in^/AMn3SPLAAIRKK3SSKALCU)YLOEDKIi)AriSTama^ 
TlARAKIPLrPAVSRPAU.VCVp™RGHAVI UMIANISVKPEOWCFARMCLAYROCW 
OSKtPTICLLNIGSaaUCGTEAHROTFRMLRETPCEATLCNIESGAVrtXSAAOI^^ 
Ttail nJCTABCVTCFUJRI IjGDKL£AD lORRIJTfrFYPCSVVCCLSKLVI 
LFHGILCSXNLA0ARLCXRIL5NLI 

CPn_0963 1098374 1103224

pmp_21-Putative Outer Membrane Protein

TPUincWfVAKFCTVRSYRSSFSHSVIVAII^ACIArEAHSLHSSELDU^ffWC^ 

AHVEEAQTSVUCCSOPVNPSOKESEKVLYTQVPLTOCSSCESLDLAnANFLEKFOHIJ^ 

TTVrCIOOKLVWSDLOTRNFSOPTOEPDTSNAVSEKISSDTKOJRKDtXTO 

EVSSDLPKSPCTAVAAISEDLEISENISAROPLOCLArrTOn^SISEXDSSrOGIIF 

SGSGWlSCLCFENLKAPKSCAAVYSDRDrvrENLVKGLSr ISCESLEDCSAACVNIWTH 

CCDVfftLTDCATCU3LEAIiU.VK0F^RGCAVFTAflNHEV0NNIJ^ 

NSAaCSN0CAFACCSrVYSNPIDn*ALWKOCAI.SGGArSSASDIDI0GNCSAIErSCN0S 

LXALCEHICLTOrVOCKAIAWGTLTUtNNAVVOCWNTSJ^^ 

VAFKOOTAALTCCaVLSANDmrANNrGEILrEQNgnWHnG 

ENINIXCNSGAITFUQOCASVLEVMrOAEDYAOCGAIiCHNV^^ 

TFWICEVVaXUIIOT>RVTISNMSCDWFK^^ 

pgCSl JlACSH SraflfPPKiVL^VPPSUXEHPVVSSTDrROCGAIlJ^ 

FSGwt/w'jF F5S'iVjiJiArvtyyAr j,friTrevNvcsNCTfl/^ 

VDXSMIHSVEFVSrCSCaCPQCUVCALNESW 

TICOIOGNIArKSfFVFCSENOASOOGAI lANSSVNIOOKACOILTVSNSTCSVCaGAIFV 

GSLVASGSSNFRTLTITQfSCDILfAKNSTCTTAASUElCDSFaXAIYTQI^^ 

VSFYGNRAP5GACVOIA£X3GTVCL£AFX3GDILFE^ 

VDraCyiIPQDAITYEEin'IRfa«PD!Q>VSPLSAPSLIFTlSKPqDtsiQ^ 

KIPOXAAIQBCTlAI^OrUELWlAGUCOETCSSrVl^SZUUFCSQVDSSA 

EETLVSACVOINMSSPTPNKDKAVDTPVLADt IS ITVDLSSFVPBOOCTLPLPPEI IIPK 

GTXLKSKIUOLICI IDPtTfVCYD«AU^HXDXPLX5UaASC3f^^ 

V5Z^ XTPATYGHICVWSESlMEIXaU.VVGMOPTCrrKUfPEXOGALVL^ 

LKOEXFAKHTIAOWffiLDFSTWWSSCUWVEDC^ 

DFLIQCCrSOftUXTCSQSYXAXNDVXSY^^ 

YCTtCISTCairtCIOCT'IACTSXDW 

GKLVKTFUKTKFEin^X PFGPAXAfAYSRGSRAEVNSVQLAYVFTIVyiUCCPVSLrnJCI^ 
AYSWKSYGVDXPCXAWXARLSNrireWNSYLSTYIAFNYEWREDLXAYX^^ 

CPn_0964 1104812 1103301

No robust homolog present in Genebank/EMBL as of 11/7/98

OSXLESIIKYFYLIHNSKMHMSNPXSLPSPAELIAKYNLIPmPIYPWWEUXLnWA 

(OTtLTNVACTVLHPSStraSKKIUlPO^ 

XVACLRIJHPLPPKICIVEDI^EPTT£CrHE\^OPFXFAU>ALLFSXWLItSnCXVGQS>^ 

KAPU>NPFUmLVM5PQES0CAMUCZPOtCS0UCKVIJCSt^ 

EHDSNPDXICrFTILXKLLXEALTCKSSt^KTPSTKEIQIOAALFtASSCirr^^ 

RSLNRLYSIAWESPM OLLIW VQgrKERELHS lOOGOOAEEYRFAAQOHCCRyTEAXEOVL 

RNESAAKLQWKVINTMKFTHGKNtCLVTEHtCOTtfiALTlJl^^ 

FI/nCYLNSCNOLVNSVFXSMOKADPEIXALIREFALDILYASLRLPQTSAm^^ 

OPETYEPNKACIAYLLYVLKXXEL 

CPn_0965 1106769 1104925 

lpxB-Lipid A Disaccharide Synthase

KGF^FSKVCLNMI PSCLVYLLYPIjGFU^LFFGSAFSXOVA^KKRKEVYAPftSFWILSS 

IGATLMIVHGTIOSOFPWU{\m«LirYLRra^ITSSRPISFTUTLVIitAL^^ 

FLYVNMEW4ASPNIFHLPLPPAOLSWHLICCLGIJaFSCRFLIQWFYIESM«Ta3rPLLF 

WKI CL LOG LL ALVYFIRICDPIMILCYGCGrj'PSrANLRLFYKEQRSTPYIimyrrLSA^ 

EASGOILGGKLXOSIKSLYPNIRFWGn/GGPAMROQClAPXIJmECFQVSGFACVLGSL^ 

LYRNYRKIUCTXLKHKPATLIF lOFPDFHLLLIKiCLRKHCYRCKI IHYVCPSIWAWRPKR 

KRILEOHUXlLIilLPFEECl^Km'SUnVYLCHPLVEEISOYKBOASWKEICrXJCW 

VAAFPCSRRCDISRNLRIOVOAFUiSSLSQTHOFVVSSSSAKYDEr lEDTUCAECCOHSO 

1 1 PMNFRYELMRSCDCAlJUCCCTIVLETALfCrPTIV>K:RUlPFDTFtJUCYire 

SLPNI IWSVIFPEFIOCKK0FHPEEIATAIJ3U^HGSKEK0KEDCRJCLCKVKrTOOIA 

SECFUCRIFCTLPAV 

CPn_0966 1108055 1106749

pcnB_2-PolyA Polymerase

tXITI IMVCENN I LSGRCLEIXKKKSNITLTFTI YSVSNHNIKLKDFSPHALSVIKTLRK 
ACYIAY IVCCC IRDLLUnTPKDFOISTSAKPEE rKAIFKNCILVCKRFRLAHIRFSICOI 
X EVSTFRSCSTDEDVL ITKDNLWCTPEEDV'LRRDFTINGLFYDPEHEEX IDYTOCVNDLR 
NRYLRT ICDPFTRFKODFVRMLRLLK I LSRSPFTVEnTTQEALIACROELlKSSOAHVFE 
ELIKM[^'X;AAKNFFOLLIENHLt^ILFPYMDKAFRLNPJ^B0TATYLKALODKXLiaCE 
AEYORHOLMAI FLFPLVNFNVR YKHOKHPYLSLTSVFITr IKNFLEOFFADSFTSCSKKNF 
ILTALI LOMOYRLTPLI mKAU-FNKKLUiHTRFLEALCLLEIRSIVYPKLDICVYVAWI 
RHHQTCKCKKOSHSOK 

CPn.0967 IIOH431 U0»»3)i'^ 

mrnA/pgm- Pho::pnrhi I itrnmuc.ir.t^ 

FTAYKFAF rCAi:R.';EK I RR If: t OFRRNM0O^IV^rLFC^/,VRCRANFE^K^\T^PVLLCK 

AVARVLRBf;B;tt;Knf(m\:KUTRL:2;YMFfNAi. r ACLfi:;Mr; t ETi.vLGp t rrrcvAP tTR 
AYRAav ; tM I .*;a::i tNrvKPNi : I K T F:;Ltr:FK r .;f//i,EOP i ETMvr;FJU5pr ;pLrEOHAVGK 

NKRV r DAK;HWKf'VKATFI»K/ IPTI JCf ;LK I VU/TAIKIA.'JYKVAl': VFEEriVKEV [CYCCE 

PTC IN r NEi ici h\t.vi v\- r OKAV ( i-mjAi \u\ I Ai,c/ :d( ;dp t r MV0Rc<nf i vr* ; w r ui tCA 
f;Drj(KK:;ALruNKWATiMmhrr/ixYf.EiUA:t//yFT.':i''/nof(irvtjiAMi>3iEVTiyy:B(} 
linm I FLOYNTTi :u : i v: 'm A,atjn t h r f::K' :m] ::ut,rAt' t vk: :raTi. i nvavr ek t put 

t PL! ERTii* w^^rvvu :r: V :h f I .( j'Y::r rrFi< u ^'/MvnraiKKMyvij *i-\KAu\ w I OAELt ; 

tc:;rk ^ „ 
0RHCCrFr:YU:N0DCV3IVLEC[>KL£VRCYD5Ar,. . VEOELTtRKTVCTVOELSNLF 
QEPEt ITAo'V tCKn*WATHCVPTEINAHPHVDBCR^CAVVHNCI lENFKELRRELTAOCI 
3FA30TD3E I IV0LrSLYY0£3QDLVF3FCQTL«OIJlCSVACALIHKDHPHTILCASOES 
PL ttX;U;KEETr t ASDSRAFrKYTRHSOAIASCETA rvSOCKCPEVYNLELKKIHKOVRQ 
tTCSEI)ASOK3GYCVYMIJCEIYCX3PEVLEa. r 0KHMDEECHILSEFL3DVPI KSFKErr ^ 
VACC3SYHACVLAlCYr t ESLVS7PVH I EVASEWRRPY IGKDTLC I LISOSCCTADTLA 
ALKELRRRN r AVLLC ICNVPESAI ALCVDHCLriXACVEIGVATTKArrSC'L.UXVFLGL 
KL>NVI(f:ALTHAEOC3FrOGU3SLPOU?3KLt^IE3LHSWAOPYSYEDKFlJU:RRLW 

■ r.y.K\,\i.y.i j-; i a ; i:/^ iw i • ;!.;*k ic :h a: . : . ;t; v r ak- ' jcu ; rr • :km i ■ :nmm tVK 
M "AitvtA£Ar!:.-^;-:u I AAV. a a;-.*:™: •/^v'maya:^v \x.:MEr:x: 

PRNLAKSVTVE 

CPn_0969 1111803 1112999

tyrP_1-Tyrosine Transport_1

^^VMKESKN PVNMLSMAESILGHVCKIS ICLVYLTLnrSLLlAYFCBCZJNIL^ 
LCISWIRHLCPLCFAILMt3PIIM*CTKVIOYCNRFrMFGLTVAFCtFCALCFUCI0PSFL 
VRSSWLrriNAFPVFFLATCFOSI IPTLYYYMDKKVCDVKKAILrCTLIPLVLYVtWEW 
VLCAVSLPZ I^0AKIG(7n'AV£ALX0A>m5WArYIAGEIJX;FFALVS5FVCVALGVKDr^ 
AKUCWNKKSHPFSIFFLTFt I PIJWAVCYPErVLTaJCYAGCFCAAVI ICVTPTLIVWK 
CRYCKOHHREKOLVPGCKFALTLMTLLXVINVVS lYHEL 

CPn_0970 1113452 1114648

tyrP_2-Tyrosine Transport_2

VYVMSNKVl/XISlilAGSAIGACVlAVPVLTAKOGFFPATFLyiVSWl^SMASCir^ 
KTVMKESKNPVNMLSMAES IICHVCKIS ICt.VyLFLFYSU.IAYFCBGCNILCRVFNCON 
LCISWIRHDSPUIFAILMCPI IMAGTKV^DnfCNRFFMFCLTVAFGI FCAL^ 
VRSSWLTTINAfPVTFUU'GFOSIIPTLYYYMDiaCVGDVKKAILICTLIPLVLYVLWEW 
Vl^VSLPILSOAKlCCrrrAVEAUCQAHRSWAn'IACEIJGFFALVSSFVCVALC^^ 
ADCUCWNKKSHPFSIFFLTFI I PLAWAVCYPEIVLTCLKYAOGPGAAVI ICVFPTLIVWK 
GRYCKQHHREKOLVPCGKFALFLMnXIVlNWSIYHEL 

CPn_0971 1114693 1115415

yccA-Transport Permease

ECSHGLVDRDYIODSRVOOTASRVYGWOTACLrVTSCVALGLYTSGLYRSLFSFW/WC 

FATLCVSFFINSKIQTI^SAVOGLFLLYSTLBC>lFFC7rLLPVYAA0^ 

ALVFCLAAVYCArrKSDLTKISKIKrFALIGLU.VTLVFAVVSKrVSMPLrYlXICYIjGL 

VirVCLTAAnAOMRRISSTIGI»WrLSYKI^UirAIJ<KYCN\m*VFWYU^ 

D 

CPn_0972 1116377 1115430

ftsY-Cell Division Protein ftsY

RCINNSLLFPSYLVSFUiOI.TUXAMFKrFRNIttXDSUTaCNISIJ3LIEDAES^^ 
CrrOTEEIXMrJUmCKAEASTIKDLITVIJLI^ 

SGKlTTAAKI-WlYYXERSESVHLVATOTFWVAGKDOARl.WANELCCCrVS^ 
ArDCrOSAIARCrrSK\^IETSGRIi{VHGNWaCEI^IVSVCGKAIJia^ 
G^©UIEOVRVFHIW^^LSGLI^^ICVTOSAKCX^I^IAKRUCIPOT 
DLDUXNKLFPEVEKl 

CPn_0973 1116346 1117527

"sucC-Succinyl-CoA Synthetase, Beta"

EGKSKELFMKUiEYOAKDLWSYDVPIPPYWWSSEEBCEIilTKSCaJSSAV^ 

CRGimCWVIVAXSSACrLQAVAmjCamFTSNaTAZKFLPVE^ 

IMDWCHRCPVLMLSKAGGKDlEEVAHSSPBOILTLPLTSVCmiYSYOUlOATKFMEW^ 

VMH(X;VOLIKKLAKCFYErn>VSLLEINPLVLTIJX;EliVIJ)SKrT 

YDPSQENVRDVLAKOICt^IALSGNiarrVNCMClJ^MSTL^ 

ASOKOIOEAVSLVI^ESVKVIJ'INIFCKIKDCSVVASGLVAVMETRDOVVP^ 

NVEZiGKEZVQQSCIFCQFVSSKEEGAIUUVELSM 

CPn_0974 1117523 1118422

"sucD-Succinyl-CoA Synthetase, Alpha"

VCRFRRYMFHSLS KOTPI rTOGITCKACSFKTBCXXAYGI^nnroCTn'PCKCXrrLWL^ 
YDSVl£AKOATCKrRATMIFVPPPYAAEAIIJEAEEACIELIVCITECrP\niC^^ 
- NSTSQLIGPNCPG I IKPGECKICIMPGYIHLPGNIGWSRSGTLTYEAVWOLTOLKICOS 
ICVGIGGDPL^CTSFIOVMALEEDPYTELIU^ICEIQCSAEEEAAAWIOAHCTKPVVAF 
lACVTAPKCKRMGHAGAIISGNSCDAKSKIOVLRESGVTWESPAH lOCTVDAVLRAKEL 

CPn_0975 1119038 1119637 

No robust homolog present in Genebank/EMBL as of 11/7/98
3 1 EEOVALS I A I KIUCI ILALI LFPLVLLAWVIRYOLHANFHCSWPFPGFSVMOAYKCS 
EAKI EEMU)UI)L^LEMSSRCLROOWrFANRLEEEL rO ELRVSETEEL ISLGCKRNLVR 
LLLTHFFNPPKRSRVESVGHEWFPVFDRLKREEEI ICDGPITRSNEELWALLOHCTARC 
IHKTLWFSIFFKYLTQIEW 

CPn_0976 1120079 1121185 

No robust homolog present in Genebank/EMBL as of 11/7/98
ILMLVYCFDPSVPTSPEHRLHAALDRWFFLCGHRARILTLECNHYRAFOENMSISTVEKI 
LKLISYU.I P rVLIALLIRCFLHSRFKCNWKCDSLSDARVPHDVOPFWDFOLFNNOERLN 
IVWnWYVSG lOVLWPVDYLRSOFPCrKEI PEAIRCEtrrVSDCOFSEESJCTSYUO^ 
DIVCYI LSLDCTYVmWILKIRAMC ITFESFPCKEADPWSPRVTHHYFDESWKALARHV 
LGBCTMVNRLDEALI RTEKPGKEXIEC ITKOFLKDYCKKHLEVMSCPDFIESLVDEK I REF 
RCPS rOiSAVCDVI DRKCOEHLLKAI INEANRRL?CMKNSSFTMRGNQVLFYT IFSPPKL 
PPAASSVYF 

CPn_0977 1121329 112340; 

No robust homolog present in Genebank/EMBL as of 11/7/98
LYItiCFANtLKSSFUiEWSFSPSVRTSFOHRVMAALONViff'FUJGRRUCWSLDSCNSCO 
ACETr/PirrTEKVUCILSYLLIPIVt I ALLIRYLLHSNFTAKVSOKFWLKTLOtJGIDI K 
:;F [LPCCI rVNTWOSATLFKA IRLBCKRVDVEYHRLHSSDKWrr I PAOKLPDDLRLTHWL 
PEKETRKTEVVRHMLAinrtKPrLTS0r;KERU»VVV^DSR3ST5LCAEKVLOYRFrDHPOS0 
OEFORUJ^EN tTTKCSEOKEVV02DLFDMAFgCWWP0FI iJV lOSPTFSEELVHQlSOKLD 
LOC tYPEDDEFEOKFLWLLKAVLHHGFEC ISVASMrwiFLtCPOSLAU}! PFLRNQK 

No robust homolog present in Genebank/EMBL as of 11/7/98

KY KFMEV/:;i- n rAVRT;;Fgi irvkaaldawffu >.:i irlk W::u)r:cNJorAYOELvs r srr 

KKVI jrLL:;YI .LVP I V r [ ALL [ RCLUi2NFH t DVEKERWf.K I RELG I D I ESCKt.P.';aYVNO 

v::::k [WFh:KnK:;Ki*rH r dvoyhtlh:;kdwwfp r vfok r pkt'irf.svwfsoketrkroyv 
uMMfj^ivi<:7t.r:tix:ijfji^>v£::K'n:Y0::AT:;LprErivi^vi:LTDN0EL0CEV0RLLN£ 

;:ATK:;r/:DKLVr.t^ :|(V:;d f tCOOWPKFLEVrO::PAF r RELVEEV:;nKLNLDFLCLEKAN 

ff4yjKr.RN;:Lt.i*Awniif'.':t7nrt3 tKKvnAijLi r\TEA ryfA^ tPF';R*: 



f:Pn_fl'>7'> tl2j'.<7L 

hcrA-cw JerinB*?iftrt>t1«*fte/' (I .U"-:.' K T 

G t DMITKOLRSWtlAVLVroCLU^iPC^XaAV^KKESflVSE^^ 

ATPAWY IE5FPK30AVTH P3PCRRCPYENPF0YFNCEFFNR FFGLPSOREKPOSKEAVR 
CTCFLVSPDC*r I VTNfWWEOTCK I HVTLHDCOKYPAT/rCLDPKTCLAVIK r KSCNLPY 
LS pr^/SDHLKVCCWAI A IGNPFCUJAr.TVCV tSAXGRNCLH I ADFEDF rCTOAAINTCN 
r^r»PLUilDr7yVXCVKTAZyr^>;V77flC icfa i pslmanr t idolirdcovtrcflcvtl 

. . : .vsiAt\ :'ry:.^:rri' :.\:\"" vv-- "alva i:' : : Av:;::"r,-i.::.. MFT?r«\v.; 
: .'i: ;; : -rn : vr.r.vp r ;rv r " r r~/ : r •? -r \r"r"^ : : a: ^ 

TKGIL: ISVEPGSVAASSGl APGCLILAVNRQKVSS IttLNRTLKOSrWENILxifi/S^ 
VtRTtALKPEE 

CPn_0980 1126988 1125504 

Similarity to Saccharomyces cerevisiae hypothetical 52.9KD
protein

FVMUmAKKKAKIW,rFFSTKDICLSYCDrrFliNCSGKPMNU)SKHFDIKSANn^ 

FISFPS ISADSOHLOOCEKCAHFLVDHVNK IFDVELWETPCHPPI I YASYXSEOPLSPTL 

MLVNHVWOPAOLSDCWKCDPFILREENGNLYAftGASDNKCCXF^ 

PLNIIWLIECXEXSCSLALrivn.EKKKEAUlADYU.rvrX3GrLSEKHPY^ 

KISI^mNKEMHSGVLCGIAYNTPmALSEILSSUWPDNSrAIBCrVTDLALPSI^^ 

LPKStmJlECEEjnXSrRPQCYZASYSPeESAUlPTVErNCISGGYTCPGFKTVIPYIW 

YLSCRLVPMODPDKAAHOVIHHUCOCWSSLKFSYEIUOCSRCWRSSANLPrVKVlOEI 

YSOLYNEECUlLVMPATIPICPLLCEAAQTSPriCGTSYLSDDIHAAEEHFSMDQLICKCr 

LSICOIXDKLPKIKE 

CPn_0981 1127019 1129952

Zinc Metalloprotease (insulinase family)

VTESKKACDTYRNFIIKSCKDLPEIESKLI.£A£HKPraVSIKMIVNND 

P0^SN(;VAHVLEH«VLCCSa^ypVROPFFSK^RRSUffFI^UFTCPOFTCYPAAS0rPEO 

nfNLLSVrrOAVFHPU.TKQSFLOEA>«YEFNSE»«LC:YTCVVFNEKIC^^ 

ALMAMFPSVTYCVNSOCEPREIVTLSHEDVRAFHOSOYSrNRC^^ 

LEEIOXJWATKLEKOAVSVPLOKRrXEPVRNILTYPVDHOEEDKVI^ 

EUALKVLEIILMGTDASPUtSRLXJCSCFCKQTQlSIENDIRErPKaVCK^ 

LEALIFASLEEI IRBGISENrVEXWVKOLELSRKEnCTSLPYCLSUTRSCLlJaWCCS 

AEDCXRIHSLFSELIWSLKKSDYlAiaiRKYFLPffHrARVILLPDTELVA l^ ^ 

LLSVSEIXTDPOCElCIQQWRELTESQEOKEDLMGirJ'MIJir^ 

0G5/U<HECFTO)r^XDV\rtJ3IPPL3GEELPWtJUXVFl/^ 

HTOGVDVSVDFSPHANKKSFl^PSVSIRGKAt^SKSEia^GlVSnWSVDFTDrPW 

UJCHNEALTNSVRNSPMSYAVSMACSGNSrPCAMSYLTTCLPYVKKIRELTIWrOT 

EAWILORLVTKCt SCKKQIVISCSAHNyOOLKDNKFVGLLDyLrvt PEP*#ENPSINLW 

TSnSZiiIPARAAniALAFPIG0ZAYDHPOAAALWAA£ILa«JVVLHTKXR^^ 

AANLSI«;srYCYSVRDPEIArrYKTFLKGVSEIASCNrrKEDrYBCALCV^^ 

GSRASVAFYRLKSGRIPVUtQAFRRSVLENmCEHICMVMDKYLE^^ 

LRNKVLTLDICDFPXVPAI 

CPn_0982 1131315 1129962

yigN family 

KWaASVMNLPVSLACIiLSGCVrrLGVFVSSSLYARKKRAFLEKIOiaM 

NLSWtOEOLIEDFSNRIAWSHKLimflCEEAC^rfFCtyrSKSPOSILSPIO^ 

LETFgrKHAEDRCRXJCEQI5QIJJ^vaCKLEH£TH\rt,TDILKHPf^^^ 

AGKUCYCDYDSOTTSAOGAFRADIXIIUJ^ORCLIlDAKAPISDSYrSWEE^^ 

IKEHimJCSKSYWEKFHQSPEYVILFLPCESLFNDAlRLAPCLMEIGASSNVXLSSPLT 

LLAIiKTIAYMWKQENLOKOIOEVSt^EUiRWXJVV^ 

SSFQYR\rt.PTLRKFBGLErSSSHOIEEPTPIESLATSFP»rCDIDTNt>VIESLEIC(S) 

CPn_0983 1132045 1131206

pssA-Glycerol-Serine Phosphatidyltransferase

KNPtXTOJKiCLMOirMACUJUlARCKWlVVTPNAITAPCL^^ IFKSVLRTSSSVEL 

FHRLOCLSLLLISAMIADrSDGAIARIKKAESAFXWOFDSLSOAVTrOIAPPLrAIICSLO 

GIYVCNrrSSUXmilYSLCCVWVRYNLFSOlCTVDVSKPYCFICLPIPAAAASIVS 

LALFLASDFFPOLPAOUlVCU^FALLriCCLMISPWKFPCVKHFRFWSSrLLVVTIGL 

AACLFFSCLVOHrVEVFFLVSWLYTLVGFPIFSIIYWCKS 

CPn_0984 1132370 1135510 

"nrdA-Ribonucleoside Reductase, Large Chain"

CKWVrVEEIWYTIVKRNGMFVPrNQDRIFQALEAAFRDTRSLCTSSPLPim ^ 

THKVWEVt^ISECOVNmrtSIODLVESOLYISCLODVAROYIVYROORKAERGNSSSI 

IAIIRRDOCSAKFNPMKISAALEKAFRATU^XNC^^'PPATLSEIITOLTLRIVEDVLSIi^C 

EEArm^IODIVEKOLMVAGVYDVAKNYILYREARARARANKDOtWEEFVPOEET^ 

OKEnjTTYLUUCTDIXKRFSWACKRFPKTTDSOLLAtJHAFHNLYSGIKEDEVTTACIH^ 

RANIEREPDYAFIAAELLTSSLYEETLCCSSODPNLSEIHKKHFKEYILNCEEYRLNPOL 

KDYOLDALSEVLDLSRDQOFSYMCVONLYDRYFNLHBGRRLETAO I FWMRVSMGLALNBC 

EOKNFWAtTFYNLLSTFRYTPATPTLFNSGMRHSOLSSCYLSTVKDDLSHIYKVrSDNAL 

LSKWACGIGNIwrOVRATCAVl KGTNGKSOCVIPF rKVANOTAIAVNCG^ 

E^fWHLDYEDFLELRKNTCOERRRTHDI^JTASWtPDLFFKRLEKKCMWTLFSPDDVPCLHE 

AYGLEFEKLYEEYERKVESCEIRLYKKVEAEVLWRKMLSMLYETCHPWTTrKOPSMIRSM 

QOHVCVWCSIrtjCTEIUJCSESETAVCNLGS INLVEHI PiroKLDEEKLKETrSIAIRIL 

DNVIOLNFYPTPEAKOANLTHRAVCLC^FODVLYELNrSVASOEAVEFSDECSEIIAY 

YAILASSU^ERCTYASYSCGKWDRCYLPLOTIEUXETRCEHIWLVtyrSSKICDWTPW 

0TIOKYr»tRNSOVHAIAPTATtSNI IGVTOS EEPMYKHLr/KSNLSGEFTIPWrYLXIUa 

KELCLWDAEMU3DLKYFDCGLLEIER I PNHLKKLFLTAFEIEPEWT lECTSRROKWIOHC 

VSLm*YLAEP[X:KKLSNMYLTAWKKGLKTTYYLR30AAT3^/EKSFIDlNKRCI0PRWHKN 

KSAST3 1 WERKTTPVCSMEBOCESCO 

CPii_O0f*5 113S432 1 131^57 1 

■nrtlB-Ribonucleosidft Reduce. ire. i:h.jin* 

tSVHKYCCRKKNNPRLFNGRRLR t L3 ITEKRi:AKMEAO t l^;KLKRVEV;'KKCLV|gCNOV 
OVWLVP t KYKWAWEHYLNrx:AN>ArfLFTEVrMARD E ELWK.':DEL::EOERRVt LLWLGFFS 

TAE:;Lv»';NNiVLAiFKii iTNPEAROYLUiUAKEFJvvirniTFLY h:fj:u:ld£i:evfwayn 
fj*a;: I PAKDOwn.TVDvt j>rMFr:vor;::Et;ij -/yK r knl'a;vy r {mix; [ FFY.Tr;FVM tts 

FIlRONKffr; tllEUYt'Y I I4IDRT rilLNP: lOM WWKEDK'tT/VrTTKIOF.ElVALtEKAVE 

f.E lEYAKDi.-LPRO fu;iJ(:::MPrDYvmi I Ai>«kc.KH fcim vri::HNmm*;(?rHDLNK 

RKNFFETRVTE-f f/FAi rNUM 

'•m_ir»Ht. I i i*./i;r 1 1 r/ »••*. 

yihlM r''*^Lt.'t«fl MtNA M*4 liy 

Fi .i>'MrrcDr^:pi'Kr wK wti* I* ' i .vi .yviK*MYKi:ityMF:.T:r^iKjKKR.»Nirr:: i m'eu: 
:*i »«:LW/VAOAyKUi vviwi Avw..t«Ki.i''/i'K 1 w:;km iNiio tVMi.i« inai rrAr:rFt-v.*YVP 
I jijKi oi< LWNFpririrf(*KMM(mKi w u^ji". :^\vl: i : :ii::u^L::AVKAf ati >i wr^ m j<: : t fa 
I i/n ii^APRMFTfYY r KKrrrr/< ^ i: ii Mirrv yj\\ \ rmiF i kka* : i 
CPnJ)*tH7 1 1 1749 J IIJHU5 

yt«8-lUe prediccwl cWiA mer.nyldse 

LENCIFAtGFTMFAYRTLLTIIPAn/OVSHEIFKTr/VPCETrvrDATr 

GBCRLWYD igKEALSNALLLTETHLSEOERSVI EMKEOSHEH I LEKDVKL I HYNLCYLP 

KC^mE irnJynTEI3LE:VALN rVRPDCLir/;/Cf PGHP ECEKCTHSVEJLAORL^ 

TVSSFYVANRCRAPRtFlFgRQGSE^SVDKC 



'" • : ; '■ •■••* *•/•■•»:• '.t y'it'.i ■ . mm.- ' f • 

KFFINL INLOOC I LKMKEAAPMHFPFPVRRSVWLNRVSTFRIOCPANYFKAX HTI EEARE 
V I RFUfS I WPFLI IGKGSNCLFDORGFDCFVLYNArYCKOFLEDARIKAYSCLSrAALC 
KATAYNCYSGLEFAACIPGSVGC»rFTWGTNESOISSVVRNVETINSECELCSYSVEEL 
ELSYRSSRFHROOEflt^ATFOLSKKOVSADHSKSILOHRUfrOPYTOPSACCIFRNPEC 
TSACKLIDAACUCClAICCAOlSPUiANF I INTCKATSDEVKOLIAI lOSTLKTOC IDLE 
HEIRirPYQPKIHSPVSEK 

CPn_0989 1139552 1139016

CT832 hypothetical protein

LRTStAVKCVU.TrFWIXVMATLSPE)CFSCSPISISKEFPQOKMREriLOMLyAU)M^ 

AEOSLVPLLMSQtAVSOKHVLVALNOTKSILEKSOEIJ3LIICNAIJCKKSFDSU3LVE!^ 

UlLTLFEHFYSPPINKAILIAEAIRLVKKFSYSEACPFIQAILNDIFTDSSLNeiSLSI 

CPn_0990 113999b 1140440 

infC-Initiation Factor 3

SVAWKmOIRAPKVRLIGSACEOLGItAlKDALDLAREAGLDLVEVASNSEPPVCKI 
MDYCKYRYGLTKKEKDSKKAOHQVRIKEVKIJCPNIDENOFSTKLKQW ' 
OiFRGREtAYPEHGFKVVQKMSQGmiGri^EAEPKLAGRSLICWAPGTVK^^ 
HAODEIJO 

CPn_0991 1140394 1140612

rl35-L35 Ribosomal Protein

KORKNRKSLMPKM}aTncSVSARFKLTASCOLKRTRPGKRH}a^KKSSQEKRN^ 
KGOVGHYKRKKLV 

CPn_0992 1140622 1140996

rl20-L20 Ribosomal Protein

GKLVMVTtATGSVASRRRRXRILKOAKGFWCnSRKCHIROSRSSVMRAMAFVYMKRKDRKGD 

FRSLWIAiUiWASRIHSLSYSRLI^CLKCANISL^mXHLSErAIHNPK;FA£I^ 

LEAIV 

CPn_0993 1140975 1142030

"pheS-Phenylalanyl tRNA Synthetase, Alpha"

KSPGSHSLGIRISMEWKEEIEAVKQQFHSELDOVNSSOAIJ^UCVRYt/aaCC 

LKQCTDKAla/;SLI^©FK^YV^3^>0EKSLVLLASEQAJSJtf•SKEKIDSSLPCD 

HILKSILDDVAmirVHIiSFCVREAPNIESEANNFTIXNFTEDHPARCJMHOTr 

RTHTSNVOAREUCK00PPIKWAPGl^FRNEDrSARSHVLFKQVEAFYVDHN\^ 

ILSAnmSFFOROTELRFRHSYTPFVEPGIZSTOASCECCGKOCALCKim^tt^ 

HPOWJWaA/DPEIYSGYAVCMSIEIUAMIiCYGVSDrittJSENDIJ^^ 

CPn_0994 1142371 1144440

CT837 hypothetical protein 

UWfllGGRMKRSRRNFEQAI£NlXKrJCEISLATSNDSYUWPARFNQRK^ 

EAUOJVEMfLI^ISCVSKSHAOKAUCESDFLIACTWVTSFLENOEDLYXStX^ 

KAYDEVKKNLKEVPTYDLS i'Uti'I'UaiKEPECrUJNLVEViaiDRSYELrn^I^EQOKRfY 

NDAI,VOIIYKO^«aJ^ETWBCDPLTKTUiWNSEEVKNIASSLVIV^II»W 

LDIEAVVWWIAVMAI^FSRYEATOVFKSPKiaWIWYFNDrLLFUU^ 

EJUtQTKlXASALSUSIFESKLVFEEASRYLYFNIQTKLENANGKKPLSPCOYLT^ 

HRLlSKYPNGPLFKAMimVLEKESRPYDPMILGIlJSLECTriCIiKSItSIDIIRSPSPVrO 

SSILYANCNEEFU^FLNAJCAHRSEVTLVLNIONRISRXERARSRVrEEALEOEEHAPYVH 

AFSFPEPEEUjONIXSIHCDIETFADFFSILOEEFHKPLLASSFFLTKELKEFVCSFUCE 

KLTAUCDIFFAKKKILFRNDXLLUilLLSYLIVrKLIERTNPNSrVVVSK^ 

ACFAFFSREAFVOEHSLKTiLTSn/LSPTLVARDRLVrVSHIELLSKTVlKriJC^^ 

LKSFFKODIECWEFTCYLHELTEVSHKHNL 

CPn_0995 1145515 1144415 

CT838 hypothetical protein 

RMLIWKRHtLTRFWFALTSLLVIALIFYASIHHSLHnJCGASTAASGASVKLSILYYLAO 

rSLKAEFLHPOLVAVATTS7U^AMC»«REIItX0ASCLSLKSLKHPUI*SCAVIM^ 

NF(>^ri^PICEKISITKEN^IDRCT^DKBCX:KIPAI,YIJCDQT7LLYSSrEPKTLTLN^^^ 

KDPKTirmEKLAFTTt^LPICLNVTOFFANDSEWLELKEFFOMKEFPEIEFNFYENPFS 

KLFSAGNKNRI^EFFKAIPWNATGLGOTOVPORILSLLAOFYYVLISPLACMAAI ILSA 

YLCUlFSRTPrVTtAYLIPLGTVNIFFWUCACIVl^SSSVLPTLPVMAFPLIVLFLLTN 

YAYAKLQ 

CPn_0996 1146592 1145519

CT839 hypothetical protein

AMP I L>;KVL I FRYLKTAAFCTl^LIC IS I ISS WEIVAYIAKDWtyrVLRLMAYO I PYL 
LPF r LPGSCFVSAFSLFRKLSDNNHMTFLRASCASOS I IMFPVLMVSGAICCLNFYTCSE 
WlS^CRYOTCKEIA^«AMTSPALU^LOKKEN^m^FIAVDHCAKSKFD^A^rVALKCNN 
tSHVCI IK5I IPOTTKDTVKAKDVVFtSKLPDSLTESSSPSSORFYIETLDElXIPKITS 
TLFACKSYLKTRTDYLPWKOLVKOSLKHSHLPETLRRVAIGFLCITLTYACMILGIHKPR 
FRKSIALYF t FP ILDLILLIVGKNTKNLPLAfMLFVFPOLVSWWFAARAYRESRGYA 

CPn_0997 1146699 1147^64 

mesJ-PP-loop superfamily ATPase

AYKM\^S0LLR0DK0LDLFFASU>VKKRYaAl,^3DSLFLFYLUCERCVSFTAVHrD 
»JCWRSTCAOEAKEL£EU:/\RBCVP 

AOC lFLAHHAND0AETVLKRLLECAHLTNLKAMAEft5VVEDVLLLRPLLH I PKS3LKEAL 

oar(ujyloop::nederylrarmrkklfpwleevfcwiitfplltlceesaeljevleko 

: ATLPitRNK I V r t KrGVW I D 

*:\tijyj:m rM7Hii usosR^ 

ftsH-ATP-dependent zinc protease

LL':rWKr>1:;K0KKMKPi:rKKNFmrFFLLP.VV-F^r/'/AmNKI^:KK,\n'vr:F:;H0£Dt 
M. 40hP /-rru .R: J^KALRTY' :::0LYELICKYL: :rvi / : rr -CTLKRErJCPLYOOVEVSLTO 



PD0PRNLVLEKTFK50EP3P. . -FTFLF : ILVLLFVT'-VrrlRCMRCMSCSAMSPCKS 
PARMUJCCONKVTFADVAC : £t**i<EEL r E :VT3FLKNPNKFTSUX;R r PKCVU. ICPPCTC 
CTLIAKAVSCEAD^flPJ^[KC3DFVEMF\CVr;A3Dn;WfflFE0Aia^ 
RHRCAClOCGHDEREQTliNCU.VQflWraTNEC^ 
^rtOILPDIKCRFEIL^^^O^AKRIKtJ^T^^LMAVARSTF^3ASGADLEN^ 
TAVTAVDVAEARDK\a.-/CKERRSLDlDAEERKTTAYHESCHAVVCirVOHCOPV^^ I 
PRCt^LCATHFLPEKNKtJS-rWKKELYDOtJKV-LMOCRAAEEirLCOrSSCAOCOISOATKL 
VRaMVCEVir>ISPOLChr,TYOERSDCLTXr\fCCYHEKSYSErrAKT:DTELRfIL^^ 

..: : ::;:;HKA;::^L>r^:rL:>:7rr: • .-J-vxerMrirnvrr v i: -;?; T-LFKKJjrcL 
"fT-ftKrr"*.r 

CPn_0999 1152859 1150766

pnp-Polyribonucleotide Nucleotidyltransferase

OETrMNFOT 13 INLTECK I LVTETGK I AROANGAVLVRSCETCVTASACAVDLBDICVDrL 

PU lVDYO EKFSSTCKTLGSriKRBCRPSEKEILVSRLIDRSUtPSFPYRLMCEiVDVtS^ 

WSYDOQVLPDPIAICAAS AAIA ISDIPQSNIVAGVRICCIiyfOWINPTKTEtASSTLDL 

VLACTENAILMIBCHCDFFTEE0VU3AIEFCHKHIVTICKRWLWEEVCKSIWLSAVYP 

LPAE\a.TAVK£CAQOKnXt^IKDKKVHAATAHEIEENILEKLOREDDDLrSSnirKAA 

aCTUCS0TMWU.IW3REIRAa;RSLTrVRPITICTSYLPRTHCSClJTR^ 

CSEAMAORYn)UCECLSKryt.OYrrPPFSVCEVCRIGSPGRREICHCKrAEKALSKALP 

0SAT^PYTIRIESNITES^CSSSMASVCOCCIJVL^^DACVPISSPIAClAICLILDt^^ 

ILSDISGLEDKUracrKIACSCKCITArOMDIKVECrrPAIMKKAt^AKOGCm 

fCTCALSAPKAOLSOYAPR I CTKOIKPTKIASVICPGCKO IROI lEETXT^OIDVNDLCWS 

ISASSASAINKAKEI I ECLVCEVEVCKTYRGRVTSWAFGArVEVLPCKECLCHISECSR 

QRIENISOVVKECDIIDVKLLSINEICCOLKLSHKATLE 

CPn_1000 1153193 1152891

rs15-S15 Ribosomal Protein

SAFAAIIUlRHPMSI^KCTKE£nTCKF0UiEKTOSADV0IAILTEHIA£LKO0Ja« 
OONSIUAIXKLVCORRiaiXYI^STtTrraYKNLITRZJaJUC 

CPn_1001 1153369 1153869

yfhC-cytosine deaminase

YYlXLCCEICLrNMEKDIFFH»ArKEARKAYO0DEWGC:\mn^ 

DATAKAEILCIGSAAODLD«WRIXmVLYCTLEPCLMCAGMOIJ«I^^ 

ACCWWVNIFTEEHPFHTreCIOTrcSEEAEHIiaCKFTVEKRREKSEK 

CPn_1002 1153844 1154089

CT845 hypothetical protein

KSAERKVKNKIVTLLDQLYEDQESRWKLCEEIVPNLTPEDUjOPMDFPOL^^ 
SGVL5GICEVRAAILAALSQEN 

CPn_1003 1154562 1154092

CT846 hypothetical protein 

TSWCriHPLI>CPOR0IACKASMRVIFPDKHNNFPNLSKUJaa.PSmVTSCIAPFrW 
IIMOTGIPCIXEIlALSVKGIOKHHFWOrLTYPLITADSt^LIWDOSFEnQRLLLI^ 
LDFFLTYKAIOKLIRKLCAFSVtWISCOALI ICAVLWCFMALIHSSOSFTOPESIICCV 
. XTVOlFI^PEKRFTIGPTPLSVSIlCWCrLFVLGrYCCILIFSGArtXUASKLAIVLML 
FCKKEICIPNPYTTSLRr 

CPn_1004 1155418 1154879

CT847 hypothetical protein 

HLSIEE2>^SI0PVS^^T^KADKVIP0ST)a^ISDSrTINK0SAFyfCrSVKLI«5^ 

GKSIIAVLEI^IVQOQRVXELINLPUJCVPDIOKKIXISOOEYXNONEZG^ 

ANRQHIQQEI^SAOORAOAWKSVNSTTIESWILQATSSHLSTLKELTIIUWLl^^ 

CPn_1005 1155957 1155415

CT848 hypothetical protein 

NRKPVRLNMWI IDPLSAKK^LOAAI^n/PCTP t WPNTATADDI lAKFSKDSKPLIVTVY 
YVYOS\aVAO»n^IIAOELOANSSAC7rYUWEALYOYVSIPKNKUIW4SSSYtONIOS 
CWOAICASROAIONQ ISSLGNAAQVISSNLWrUNNI IQOSLOVCOALlOTFSaiVSLlAN 

CPn_1006 1156493 1155990 

CT849 hypothetical protein

TKVNFFIMS ITTLCTXPTVOTINSSRPPLEPLNTPK ICAVLFS lYEtXLOAlEIROCTrVL 
TQSQOLN»m«IOOOLWETN0IKYAIVSAGAKEDEITRV0NON0I*YSAORSNIODELVT 
TRQbCOIILSKASTNINIIOQQSSODSSFIIClTMSICSTVNOUiKPU; 

CPn_1007 1156689 1156907

CT849.1 hypothetical protein 

LWYKSUACEEKOVSGNECNDYPEVFKDDVSAYVLVTCCOMSSECKtOVEmYBGD 
YU.TKARDSLOES 

CPn_1008 1156904 1158223

CT84950 hypothetical protein 

VLNYSFtCMEJCPMY\^KRLYRWVN0LIKU:DLVKNSR3FSVEWVFISAUI,ITO 
.WKVSLVPFIXLFSFLAFrLILCFRCKCYALtXCVr/TLYVAKYVVCETLYVSFWLSCL 
'WSFLI>FCLFLOCM<(l-\CEEEKVKCKEOLRLSEOLCIAORSAYEDLLLTKSOEKEFU^ 
AOCLDRELTECOEU-KAAVOKOEYLT IDLKILAD0KM3WLEDYAELHNKY I ELVSKNCOV 
VFPWVAEPSVCES0C3ERVDVSRWVSAL0EKEESLERLRNEILVEK0RCSDYEHRC0ELC 
LLtONFTALERRCEELCNLUJOKETO INELHOLVCKSEEJCVSVEPSAHAETSCVEEKOYK 
GLYSOWroFLEKSETLSU-RKKLFAVOEKYLTLKKKEELTKODISFDOISMIOCLLERI 
EILEE?/3H£>EELVSRSl^L 

CPn_1009 1159085 1158186

map-Methionine Aminopeptidase

YRLUm7tU1KRMDPa*X:;.;RICWKOCHYPOPPKMSPEALKOHYAS0YNtLLICrproitAK 
I YNACg CTAR I LDELCKAiJOKCVTTNELDELCOELHKr/DA r AAPFKYCSPPFPKTtCTS 

r jtEVf cir: t pnd r plkdjo imn tov.-/: rvon-nrcDCCPK/M tOEVPEiKKKtcoAALECL 

UU-MA\iyVZ I rii.'EUIFArFJVRADTY' IFr^WOOP/^irrAU EHIENPYVPHYRNRSMIP 
t ^KH r ^T t El-M t ttVi :k K UnrVDPKNf.WEAff rCWHiP.'JAOWEirr IA tTETWEI LTUilO 

';i-tt_i'»i'j ii''-m:7', 111101.7 

'"Pi',:: i*y|H*t ht.-r u'.il pio(*>ttt 

•/M^^.riH.■:IJJ•""yt.F^\:l^.::[l-/^vALLKNK::RKKV^PvrI.RnM.KA^yv^LILFV^ 
rspfjyurjfM.wAyfj i fvi \t-u .h-r/:: t KMMr jvcHPEKAKDr/rnKTF.p r FFPiw\FpvrTr:PA 
7(tauj;ymkiv; : y:;iip:i r yv.w t f AWAF.':l.^Tr,l/::;::FFnF!^^T;N(•T:LL-'u.ERLw:[AL 
'TM'»t hypothetical protein

I^MRLJCNY^M lyF^FFLPCTC r L.1IJUD:;LTNIIJUj .. .-u/WJVKORML^^ 
AMFALY<^t^UX;LK\rtJn"PVCA IEVVtX;iAVTLJ^C*^RAVtJlLJKEESWI p-rKTOM 
PCr.'^PtALPLMFGPSC 

^Pn.lOi: 1152220 U6042I 

'/2t:0*ADC rrinsDorcer oennedse 

A r F.'iL rrrTKMKKK F : FYr/ r/r.':LLFUrt3fr.'?RKP PTrr^FF^ppprir: I A.^TTLOSLPUX 



'■T f.t r,;' fv ' I ' '. .."Tr^s." . **.':■ 



M I FSCLK : A tG3 ACFAA I ACEWVASOSGLCILHLESftRNYEHaJVFACLATLS ILTLSLF 
■3 ITUI E3CLr FSLFRVKRMSLKHKSVAKKALSVIJ^ r P IMLI PWKCNSKSPPOKKNLTSL 
TLU.OWTPNPNH I PLyACVWCCYFK0HCU)L0L0KNTOSSSAVPKVLFEQVDMM.YHAU; 
rMKTCIKCMPIOrVCRI.IDSSLOGrLYRSODPIYKFn)UCKVLCFCUffiSRDU^^ 

TG POL! VFTKKGTKASEPEIVEArOKALOESI I FSKDHPECAFKLYAKETKS I PKNLVQE 
YLOWECTFPLLAQSOOPLSKOLVDKLLrr X IXRYPEUASEVAXFSLNDLYNPSLFEEQSV 

CPn_1013 1162209 1163624

fumC-Fumarate Hydratase

RENSI2mRCNrDMR0EKDSU:iVEVPE2)KLYGAC7mRSRf^FSWCPEI><PrE^ 
KKCAAOANODU:FU5SKHCCMIVAAAOEILECGFEEHrPUCVV<7rcSGTOSNMh^ 
NZJVIRHKCX7^LC5KOP:HPNDKV^0C5OSSN^VFPTAKHIAAVISUCN^ 
DAKVEEFRHCTVX ICRTHUlDAVPrn^COEFSGYSSOUUtCLESIAFSLAHLYEtAIGATA 
'/GTCLNVPEGFVEKI I HYLRKETOEPFI PASNVTSALSCHOAtVnAHGSLATLACALTKI 
ATDI^Ft^SGPRCGU:EIJFPE^;EPCSSIMPC^CVNPTOCEAUW:AaVt£N^^^ • 
3R(^FEUIVMKPVIIYNFL0SVDLI^ECKRArSEFrWCUCVN^^ 
IJ^PVUrrDKCSKAAUCAFHESISUCEAaJVLOYL^EKEniRLVVFENM 

CPn_1014 1165456 1163732

ychM-Sulfate Transporter

ALASTLCVCIVKVWAFKNFIPKLYTSIKBCySFmTKKDrOAGrTVGIt^ 

CNraVSP rQCUASIIGGLIJVSAHXSNVLISGPSSAFIS ILYCLSAKYGAEA^ 

CVFLIAF^SLTCLCT^^KYMPYPV V T GL r^ GL AIII^SSQIKD^^a^Q^CANIP 

lAVVTOLWTiroSICSrAVGLrTLLIKrYFRNyKPRYPGVMIAIVTATTLVWL^ 

SRYCTLPTAIPLPKIPOl^ITKILOUffDALTIAVLSGLETII^VVAaafrO^^ 

QLVA0GVANIG7SlJSCIP\rrCSt^TAASIKSEATTPIACmSIFICFIUiLAPLTV 

KI PLTCl-\AVLrLIAW>WSEIHHFIHUTAPJaCDIVVLLTVT ILTVMTTITAAVQ^ 

AAFLFMKOMSDLSDVlSTAKYFDKDSDFI^KAEVPONTEIYEIICPFFrGIAORU^^ 

01 EXPPKI F ILCmWPTIDASAMHALEEFFLECDRCXrrUIJJtf^VKICrPlJCJOT 

ELICVDHIFSNIKSALLFAOAL'nJLESKrsrRHLV 

CPn_1015 1165550 1166893

CT857 hypothetical protein (possible IM protein)

KNNYKMFSFFTSVRVRSIOroHEirLEVTMIJXOLCArjXrGYIJ^ 

AMOCaiJWLVCFSHIPMADHMILVEEIAI»«S0VIFriJ'SAMAIVELinAHKCFSVIV^^ 

IQSRTTitWAUCLSrFI^AAU»JLTSIIIIISIIJCRL\^^ 

AWTPLGDVTTTMLWINNKITSWGIIRALFVPSLVCVLVAGFCGOFrUW^ 

WSAPPKSU«IFIGU:SIJJ«VPVWKACU2J»PFHyUJ^U; 

HLRVPHILTKrDISSrrFFICIUAVNAIJFANIiTDFSi;«4DKIFSRKVVAmCI^ 

VU^tWPLVAATMCWYTLPUJITrLWKLIAYAAGTCGSILIIGSAAGVAfTi^^ 

KRlSWIALASYFOGLrSYFVLESLNFFI 

CPn_1016 1167027 1168898

CT858 hypothetical protein

KRE^TMKKGKLCAIVrGLLFTSSVAGFSKDLTmiAYODIWrEHLISLKYAPLPW 

FCWDWOQTQOARWLVLEEKPrniYCOKVLSNmSLNDrHACrTF^ 

LSEIXaJVrvVDVQTSOCDIYUJDEILEVDGHSIREAIESlJlFGRCSATDYS^ 

SAAFCTAVPSGIAMZJCUUlPSGLIRSTPVattmYTPEHICDFSLVAPLIPEHKPQLPTOSC 

VLFRSGVNSQSSSSSLFSSYMVmVEELRVQNKORFDSNHHIGSRNCFLPTPCPILWEO 

OKGPYFSYI FKAKDSQGNPHRICFLRISSYVWTDLEGLEEDHKDSPWELFGEI IDKLEJCE 

TDALIIOOTHNPGCSVFYLYSLLSMLTOHPLDTPKHRMIFTODEVSSALKWODriEDV^ 

DEOAVAVIjGETHEGYCHDMHAVASLONFSOSVl^SWVSGDINLSKPMPLLCFAaVRPHPK 

HOYTKPIJ>!LIOEDDFSCGDLAPAILm«RATLrGKPTACAGCFVFQVTFPNRSGIKCL 

SLTGSLAVRKIX:EFIENU;VAPHIDtOFTSRDL(7rSRFTDTVEAVKTIVLTSI^£NAKKS 

EEQTSPOETPEVIRVSYPTTTSAS 

CPn_1017 1168997 1169935

lytB-Metalloprotease

VI rMRKLILCNPRCFCSGVVRAIQVVEVALEKMfyKPIYVKHEIVHhnWVWA^^ 
•/EELVDVPEGERVIYSAHG I PPSVRAEAKARKLIDtDATCCLVTKVHSAAKLYASKCYK I 
: LIGHKKHVEVIG IVGEV^EHITIA/EKVADVEALPFSSDTPCFY ITQTTLSLDDVOEI SS 
ALUCRYPS I ITLPSS3 ICYATT^mOKAUlS'/LSRVNYVYVVCDVNSSNSIWLREVALRRG 
VPADLI^^JPEOI CHTJI VNHSGD lAMTACASTPEDVVOAC IRKLSSLIPCLOVENOr FAVE 
OWTOLPKELRCS 

CPn.l0l8 nb?3«5 H70629 

No robust homolog present in Genebank/EMBL as of 11/7/98
RMSYFNYOKNSWLRSLCLLAKFFSRLLYRVFFJFRECIYLFSSLYUa-pRLFFYDLCKY 
VYSLRHCPYAKLGRI.?CASLUCECNVYCETPWSVXAXICQAFDrTSODILYDtjCCGLCKV 
CFWSHVN^COVIG lONOPHF r RFSSlMiRKI^SGFALFDTEEFKNVVLSOASYVYFYCS 
GFSRRLLNEI I LKL5EMAPCSWI S ISFPU)SF5RCXECFFTEKSCSVRFPWCKTI AYKN 
CRKC5 

<;fTi,lOlO 1172146 117063S 

CT860 hypothetical protein

1 HPPiUMTVCYQS T^rrPPPECEFDI ^VDC^iATEE.^V/AAEVOVALPAGEOYAMLRATSEt. 
• 'FC rLTy.';ECALfVALPPKEKPU?EEOFLVKNC ILMP3T3LPNLKPC03C0TGLASHRNP 
WC»:rrj::Nrrh;KAL'TErPaS3FPFFSCIO\PECD:ISVDKTFTVCVCrrPKA0E'COEASAS0 

::OAu^lVR:TY:;l•sT^KFJt;:AKEKV^:ObTK:-«^Er^OKHTvrK:;DATu:PMJLy:;^ 
Au::rrr.rj,\}y.tjm iHiX/R0OECYE0E0ECEE';KKKTPwrrTVF-';uif rf^c-i'.ur/reEsrTPi 

rrDriVEFAL;:E::0I^7t^V.KnVTriLDVLP.[C7ELMK£>(U(:;RAN0rrKrRLEERELMERE 

*\i(Ei^:v:;RyAKYAkw[* UATATu: iu:a lAn^riEi-c.nr.uj :fvck r:-':PFKDATAK 

•PFKW: I. ;KVf-"r::u:CI/rM/VV iKV1IF.L::E::AVRAVAEYRKI?yFr'Ml(UI>EVTRT fEEVKONW 

K::MijfiFiJ^lUymiu\Ai(::i.-,v 
DSEOOELLCSRREER.-EyY/^ JEKKIETKVC IKOLCKCLFrKJOCOSNCFOICKSPFOC
CTT JRKNR lAKAAOAVPV r P PP j ..;VTTl^*YLLTKCX: r :-J DF JSYCXHKOr/ESTORCUU 
UfEKRCETIKVSXCKEKRESCMISaSDSZCWLAKdf^trQ [&r\^ltf03C:&FmiClTAC 
LISLVIKCLClCIJKFVDWLEKHtPrKNEELRRKirTt lOfaWYCt WTCS i t nXWM JST 
.I P I IBCAIKC lOPAI ESTMAALRCAILFSOAEI YKLKCKLTK lOLO r EUCSFORDOKYER 
30ELLONMESSFEAL3RIUIYHREL00VYLHSLRC 



CPn_1021 1174270 U7169K


. •':n:.tm:::>(i^ V/vri-^/u WLMJPY ri'nAfH;y;i.:;::^T.K r:;v:: frvAi vWk:;lpkfft*3K 
ro.'V.KriKi'i-rTtiitKTF r i atprfh ri.flK( :::::KF^y^i.MrfP::(jAt.ri*::::ivwLF.';oKN.'rrEA 
: :KAf .TMfK: :vhy.r: :f:KAt.DKNi.::::Koi-7::\rKKFr/pf j ii ilk lk. rr/05;LY.';o.':L 



TeOK I FKIiSEDLEK- -rKtXJYHAYUDKOYAXJ XnTkWLVFFNPFVSKniF3LGASUiM5 

EOYSOAU*AYCr/rAVlJUJKDPYPHYYAYICYT^TNEHEEAEKALEIlAWVR« 

KECILOIRKHK 

CPn_1022 1175709 1174216

CT863 hypothetical protein

FSFFFYAIJCLOIKNMPVPSAVPSANrTUCEDSSTVSTASCIUCTATCEVLVSCTALEX^ 
SrnAWSLAZjCOriLATOQEUXOSTNVHOLIJXPPEVl^^ 

EPOETQTOSRSEQTUWSSSKOSALSPRSLKPEISnSKOQOALCTrPKDSAVRKKSEAPS 

PETOARASl^ASSSSCRSLPPOESAPERTLLEQQKASSFSPLSOFSAEXOICEALTTSICS 

HELYKEIU:0DR00RE0HORKHD0EmAESKiaCKKKRGLCVEAVAEEPCHlttJ>IAALrrSO 

0MRPPAEETSKKmFK)aa:j>SPMSVrSRFIPSKNPLSVCSSIHCPIOrPKmwrU^ 

KI>tARIL00AEAEANELYMRVKQRTDDVmT.TVLISKINNEKKDItWSENEOQCA^^ 

KEICVTIDKEKYTWTEEEKRUJCEIiVQMRKmiEKrTOf^^ 

VLKLLKELMOTFIYNLRP 

CPn_1023 1176008 1176331 

No robust homolog present in Genebank/EMBL as of 11/7/98

CUJFIXrFIMOCVVTI^lIFrATYCASEI^VTWAVPLSEAPCKroVRPWCMrOEEO 

CSVPYSFTifPYOYCyYYPETYCYTiaTOOESRECYTRFEIXrTir^ro 

CPn_1024 1177317 1176334

xerD-Integrase/recombinase

IFFFPWFSirSUCIAPLPILXIitSIJVSKrMPSTOFHTTIl^FSLFLSVDRClW 

YR0OISSrLTISAISSPOOrS0NSVYIFAEELYRRXEAETTlJ«RLIAUCVrriJtKDQO 

LLPYPPl IEHPKIWaU*PSVLTPOEVI»IiAVPLOME»JPRHIArR0rAILm 

VSELCDLRLCHVSDKIRVTQCCSXTRLVPLGSIUREAinAYLCPFIUJOY^ 

FLSTRCHKLERSCVVnUlIKWrAKOVTSKPVrePHSLRHAFATHLlJlNXA^ 

RIASTEVYTHVAAJDSLIEKFLAHHPRNL 

CPn_1025 1177266 1178879

pgi-Glucose-6-P Isomerase

GAEQFSSYRElCrHERKRriDCDSrKrLOELALMPLDLTAPGVTJgAPP rnrgCT TiTQgfTF 

SFATERUJOAILAALISCAEERCLHESMLAMOOCOVVNYIEXIFPSEHRPAUrTATIUttAr^ 

OSSFTCEAEDIAVRSRVEAORIJa)FLTICVRSOrrrXVOIGICXSSELCPKALYRALiUrc^ 

TOKHVHFISNIDPDNGAEyUTTIDCAKALWV^^ 

ia3HFlAVTCBGSPMDOTCKn.EVFHI>/ESICWRFSSTSHVOGVVa3 

ASAMD0£ALQPNARQaM.SALISZVINRKFLCnPrEAVIPYSSCLCrFPAKX40CCKES 

NCICSIAOreRRVCFSTSPraCEPGTNGOHSFF0CLH0GTDriPVEriGFBCSO«5EDIS 

FQGTTSSQKIJ'ANMIAQAIAIJUaSEmWNXWFIXajRPSgVt.Vgc^^ 

CPn_1026 1178961 1179137

ICUA 

CSPGFSICICnSRMFPXAVRSRCFUIKCILAAIUCCKOVVKSTAGAWXCS^ 

CPn_1027 1179172 1180755

• No robust homolog present in Genebank/EMBL as of 11/7/98
fMPOSVSSPPLSPVXVRERVPSSSCSDLIOPHAVLKISILIFALVTILCIVLVVLSSALG 
ALPSL\rt.TVSCXIAIAVCLICLOILVTRLILSTIRKVnAI«ra)AAVICEEOYL^ 
EllREZRDR^^UkVEZ30CAHLSCENKDUU)P£YUK>frERLIASLCZEN0ALVA^ 
NASl.SRDFRAYKQKFPU;AI£PWKE0IACXME(9<LrUCPECIAMVKSU>lX^ 
GFQSLVNRFAPRSRFrOTPKYEYWSRWEJIEDCaCVAAVCARrJCKFCT 

ICERAVALK£TLPLPEAVYEffLVOErPNU.TAESU«CEWCrrSYPYIJlPYLSVDYC^^ 
VQXJTEELCUCUTTCSPEDOALVRIJ'SYYRmiPAVLASFCLPPPETroSVF^ 
LLWSQIEVlATRYUCDrFVRNSEJmJSFDWSYNEMCKElSBCRIRPAEDrFrimSE^ 
PPSPLSEEGECEEFLPPCSEEEVSVLERPDLOVDSKWVWHPPVPKCPL 

CPn_1028 1180995 1181999

mdhC-Malate Dehydrogenase

FFLICCVRMAFKEWRVAVTOGKGOIAYNFLFALAHGDVFCMjRCVTOtJlIYD^a^^ 
CVRMELDDGAYPLLHRUlVTTSLNnAFtX; IDAAFLICAVPRCPCMERCOUJCONOOIFSL 
OGAALNTAAKRDAKIFVVCNPVNTrO/IAMKHAPRLHRKNFHAMUlUXjNRMHSMLAH^ 
EVPLEEVSRWnomSAKOVPDFTQARISCKPAAEVICDRDWLENILVHSVONRCSAVI 
EARCKSSAASASRALAEAARSr FCPKSOEWFSSCVCSDHNPYC I PEDLI FCFPCRMLPSC 
DYEIIPGLPWEPFIRNXIQISLOEIAOEKASVSSL 

CPn_1029 1181987 1182844

No robust homolog present in Genebank/EMBL as of 11/7/98
RVFVrSTMLWCVSMROSFOELSONAFKHIFNXORFCFIFCSLCCFGFVFALFLXLCSRLA 
PE ISLSTLCLGAFFCAFSVICASAI I VOFLLHKESOCETSKLCCAI WfrwsSLWLSLLVS 
MPFF lAMVAVVTVAHLSSFLCSLPWVCKCFHTVLt FX PYLSATALI LLFLCSFSCLFFCI 
PVLHNQE3 IDYRKLLECFRGNI LROF taWIALVPLALCSWLALOSFYLKrHLVErADr H 
TWSFLAOMFVLtVP lALILTPAVSFFFNFSFSFYLAKOEEEKALVK 

CPn_1030 1183901 1182843

predicted D-amino acid dehydrogenase

FKVHFXR lAVLCACYACLSVTWHLLLHSOCTATIDLFDPT PLCEGASCKSiCLLHAFTCK 
KAUCPPLA W tNATHALITEACKALNVP tV I .ICC I LR PA I DEDgAOLFTERVEEFPKEV 
EWWEKARCF. I S I PSMV r PPNLCALF t K.'WTLMUCLY lOCLADAtXKUTTOFYOELIEDL 
ADtEEr/DII r rvrniANA.*; ri.PELKDMf'VNKVK'^OLLCirjWPKDL.AMUJFLUNAHKYMVA 
W^yK^f^; 1 1 * IATFEHNO PECTPOPA l AYOE l MPF/UrLFKILKDAOVU Ii"YAi3<R5USK3 

in.rv u:p.t urxiMvuxiu^iKntuft r. I'nnMLAQAVt.ftKrrrAY i AKEFttT t 
.irt.>o*Af'f uitnivi)tfurlitn*: Aur. try>t r 'rr 

I KrrFMT::nTK::r:KNri:rtAiJw»4wr:r: i rfyyifFnt.r'OfjMAATAi :,m yvvt t..-wrMxn^; 
MKKCAfppFRi L::TtRrDu(wirYMY:;r'.Er;rt:i*Y r',>Tr^w;'/wu.viH iNVi'-vAvc-mDA 
r Myt-irH-fi\\ i :nti.ha iutiliha wvfi ik r vf .r: t uqm: t r mv i im tk r t r i. r i f t i l 

ij:vtx^Ar/Lt:Fixa:t.T lYi Li':;u.('Fr";M.vitor^i ritiprrrAi.-vu)i uv»^Kw;tviiwv 
• ;l t rAvL;;:>fUiVr r ivae r PF.-jAAKNrTrFPE if .ekspsvslyitssvkjlamll

VYFr;:;NAWTrrML3ITCn/MVT,PAYLAJAAFLFKL3K^V fYPKKSSIKAPLAMXTCILCWY 
;;LWL t Y/VJGLKYLFMALVLLALC;! PFY IDACKKKKNAKTrFAKKEIVCHTFrCU-ALTAI 
FLFLTHRtKl 

:Pn,l032 U«fil53 lUWfi 

CT373 hypothetical protein

( WAY'TTPyrvT-r^nrr^ rnr.'TDD^PPOPFCTFr/'DrTAL WAKI WFTITVP^ 

■; t v ' ;av ; .:• v • : ;aa; - :i * ; a ^ at ; : : i ■ vj* :Kuk: - ; : v />r'*v* 

.■;v::r^--fv;:::: \:r;!Ar-riv.:.:'r\::.jiK;j-J.r/: H/Kr:ri-p:r*r Fir-^T 

AJUnFLNFENAEPAKVN 

CPn_l033 U87656 IIW187 

CT372 hypothetical protein

NNKKKDYSGEFLTrrrrVDS lAFLPSEENFCY nCTILFFRVKKKKYAFFYGErMrSFRFLL 

t^CLCALGISSYACTPKETTCHYKRYKARIOKKHPESIKESAPSETPHHNSU^PVWir 

CSHPWK0CISVSNrXTSVEKATmOlSU3rSILPCWFYPHKALCQT0ALElPS*CW 

SrrWTLYDSPTAGOCIVDFSYTLIHVVJOTNCVOANOAAGTASSHNDYS*^^ 

QTFPCDrLTlJVI00YSLYAIDCm*YDNIWSCriSYALS»ttSATYSXX;STt»YIOFTPM 

SEIKVQLCFODSYNIDCTNFS I YNLTKSKYNFYGYASWTPKPSCG0C3QYSVLLYSTRICVP 

EONSQVTCWSLNAAOHIHEKL'rLrGRINSATCTAIJINRSYVlJCS.VSENPLrnW 

rCFATNKVNAKAISNVNKUUlYESVMEAFATIGrGPYISLTPDF^LYIHPAUlPERitrS 

VYGLRANLSL 

CPn_1034 1188589 1187732

Predicted OMP (CT371) (leader (18) peptide)

KTSV^KYKKYLSYSILV0KrARYVMICTWLFFTrtJ*SCSSFyASCRYAEVRSIHEVACOIL 
YDEETW^ILDt^ITriXOGGEAI^HSIWKSKAIQGLOKQCn'PEOEAWEAVWFWIEI 
GTVQP r ESAiriXIEKIOKOGKTTFVYTERPKTAKDLTLKOUMLNVSLEOT 
PKNLLYTSG ILFSGOYHKGPGLDLFLEICTPLPAKI lY IDNOKENVLRIGDLCOKVGIAY 
FG ITYKAOEUiPPlYFtWIAQVQYNYSKKlXSNEAAALIXIlKaHHE 

CPn_1035 1190081 1188570

aroE-Shikimate 5-Dehydrogenase

^/VQLPLKVPI'/HWIWRFSMrYYGVSVMLC^TVSGPSFCEAKOOrLICSlilLVO^ 
LINELDD0EUrrLITrA0NPILTFRQHK£«STAI*aOiaYStJUCLEPKWKDrD^ 
. LQTtRKSHPKIKLILSYHTDKNEDLDAXYNQ4LATPAErYKIVLSPEllSSEA12*YIKK^ 
LLPKPSTVLCMGTHCLPSRVLSPLISNAMNYAAGISAPOVAPGQPKLEELLSYNYSKLSE 
KSHIYGLIGDPVDRSISHI^HNrtiSKLSLNATYIKFPVTIGEVVTFrSAIRDLPFSGLS 
VTMPUCrAIFDHVDALnASAOLCESINTLVrWWKILGYNTDCESW^^ 
HIAIVGA0GAAKAIAATIJ^M0GA^OIFNR^LSSAAALATCCKCKAYPU3SLEI^ 
: I^O.PPEV^FPWRFPPrVKDI^^^CPHPSPYLEIW0KHGSLIIHGVEM^IE0ALL0FALW 
FPDFLTPESCDSFRNYVKNFMAKV 

CPn_1036 1191180 1189984

aroB-Dehydroquinate Synthase

GYDKPCSCRSCIIPTMLOTMlSETIITrPHWKLrSNFTQKKIJSSISTAYPLVII^^ 
VQOHU/SPILDHIKMLGYOVIVLTrpPGEPWCTWETFISLQYQLV^ 
. CTVLDKrCFLAATYCRGLPLYLIPrriTAMVOTSIGGKNGINIJCIICN^ 
MCPQFLSTLPREEWYHGIAEAIKHGFIADAYLWEFUlSHSXMLFSSSOIIiiEriKIWCQI 
KAAXVAEOFYDRSUlKIUIFGHSIAHAIETIJaCGTVNHGOAV^^ 
TPOLIDOLERXXKRFNLPSTUCDLOSIVPEHUOlSLYSPElTIXYTIXTrDKK^^ 
IMIEHLGRAAPFNGTYCASPNMEZLYDILUSECKVKRHC 

CPn_1037 1192286 1191123 

aroC-Chorismate Synthase

LHFSRGSWlSFLEEIXRTS'/SRSHYLVKVMKNSFCSLFSFTrWGESHGPSICVVIDGCPA 

GLEUtES0FVPAMKRRRPGNPCTSSIUCENDIVQII^GVYKGKTTGTPLSWII/m3^ 

PYENSEM.YRPGHS0YTYEKKrcrVDPtOGCRSSARETACRVAAGWAEmjVN0M 

AYLSSLGSLTLPHYLKISPELIHKIHTSPFYSPLPNEKIOEILTSLHDDSDSLGCVISri 

TSPIHDFLGEPLFGKVHAUASALMSIPAAKGFEIGKCFASAQMaCSOYTDPFVMBGENI 

TLKSNNCGGTLGCrTICVPIECRIAFKPTSSIKRPCAT^nTCTKK i 'ITY R TPOTCRHDPCV 

AIRAVPWEAMINLVLADLVLYORCSKL 

CPn_1038 1192750 1192199 

aroL-Shikimate Kinase II 

WKLELRNVMTI I LCGLPTSGKSSLCKALAKFLNLPFYOLDDLIVSWSSALYSSSAEIYK 
AYGDOKFSECEARILETLPPEDALISUSOGTLKYEASYRAIOTRGALVTLSVELPLIYER 
LEKRGLPERLKEAMKTKPLSEILTERIDRMKEIAEYIFPVDHVDHSSKSSLEOASOOLIT 
LLKS 

CPn_1039 1194011 1192665

aroA-Phosphoshikimate Vinyltransferase

yC FTMLTYKVSPSSVli*GNAFI PSSKSHTLRAILWASVAEGKS II YNYLDSPDTEAMICAC 
KOMCASIKKTPOILEIVCNPLAIFPKYTLIDAGNSGrVLRFKrALACVFSKEITVTGSSO 
LORRPKAPLWLRNFGASFHFSSDKSVLPFTMSGPLRSAYSDVECSDSOFASALAVACS 
LAECPC3r^rIEPKERPWFDl^U^^EKWLPYSCSEy^TYSFPGSSHPC?GFSYHVTGDFS 
SAAF r AAAALLJK5L0P IRLRNLOI LDtOGDKIFFSLMQNLGAS lOYDNEEI LVFPSSrS 
tXTS IDMDGC rOALP ILTVLCCFAOSPSHLYNARSAKDKESDR ILAITEELOKMGACIOPT 
HOGLLWPSPLYf3A\^OSHDDHRIAMALTIAALYASGOSRlHOTACVRKTFPNrV0frLNI 
MEARIEECHONY^iMWSTHKRKVFARESFG 

CPn_1040 1194876 1194073

No robust homolog present in Genebank/EMBL as of 11/7/98
R PSOSLFLRTWC PSSSFFEHTVCAAPU,YPRRRSPDYLFSPTCCPMSTTTVKHFIKTASR 
WF tn/LKE r VAJNVWHAQW I NTLSFLENSGAKK ISA3EHPTEVKEEVLKHAAEEFRHGHYL 
KTO 1 3ft t.'^ CTJLPDYT J WJUjCCU,Tm-LHUJ3UlTCRVLENEYSLSCQTLKTAAYILV 
rrAIELRAJELYPLYHDILKEAOSK ITVK3I CLEEOGHLQEMERELKDLPHGEELLGYAC 
C'FECIELCI jOF-VERI.BOM t FDPSSTFTKF 

«;i-|i_l(Mt IlKiJOt ttt.n2r. 

•bioA-Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase

I 'I 'UH IK I ri .It I KKTri VITLVMFtX>r:;EPi: C;:RKYOr:iILFtJGR:;N,\LTKFOYLRYOGK 
WV: :rtf If'l'AI -\N; :i<NCA;:VMI'L>J(:0' IKPNHKDNHKLLCPTMmr^lDKOSaJH.'IfX: IWHPFT 

v; .ALorn-i * i k i :hi :AYLVAE:x.Tn VLPAt jswwi:nuk;i ighpy [tkklceoaoklehv 
I Kr\Nn'Mt; rAM:t.v::Kiw\r'[.LraiL£nFFFi:DNr;:rr:;f E tAHK tAVOYYYwonKAKSHFv 
' ;i .::nayic :i MR :.\m: : :r::i 'nvrpM0LFLP.';:7r t aapyy';kkf:u\ iaoaktvfsesn t 
/u\\- 1 YKi I -u* :ai y 'Ml k/npfj ;LKr: t lkl/\khyc m/: r aok i r.Tr :Ft irtcplfa^jeftd t 
f 1 1 J f 1 1 1 : :k. :r . I*' ? :y r .i -i At .tvttk f. t i (nAFvr^tikMKALU k :i iTFTnMPLGrnAAUWL 
1 .1 .Tu:i •( K ■( o.'i'i.+i I i'Mi mmrfofaik;:;! wfi'^nvi/rr/r jvLDYr'AivvrriYFTOYROHLN 
RFFLERCVULftPUafTLWI. lQ£EDUitVfZllU:DAL>:i^P'^

cpn_io42 o r** 'nr96ff29 s n?j5934i", ; 

•bioD-Dethiobiotin Synthetase

^mSPFTYFRANFFK)RIIrVG:OTCVCKTIV3AIlJ^RAIilAEYWKPI0ACW^DSNIV 
HELSCAYCHPEAYRUnCPLSPHKAAO r Dfl\« I EESH ICAPKTTSNL: lETSCCFLSPCTS 
KRLQCDVTSSWSCSWILVSOAYLCS INHTCLTVEAMRSRNLN I LCMWNGYPEOEEHWLT 
OEIKLPI tCr^AKEKEmCT: tSCYAECWKEVWTSNHOG lOGVSGTPSLNtH 



bioF_2-Oxononanoate Synthase

PHLCQOFLIEAIJWRKSKHTYRSLSLNSHLIDFTSNCVlJCrASSPEU^ 

UCATCSRtLTCHSOLCORIEEOLAAYKNFESCU FWCYTANLCLLYALATDOORILHDL 

YIKASIYIX;iRLSKAOSFPFNHNDLNKLEKRLASSHU;RTrsrcVESVYSLHCSVAPW 

SELCTaYSAYLIVDEAHAVCVrC0C5GEXn.VSALGL0DKVlAT\r^^ 

SILKDYLINTCRPFIVTrAOPPHALTMELAYEHNORArNQREHl^ALIHHFREICAO^ 

LQL WWlU ' l P I05ICVSCSHRARQAALQI0N5CYrnrRPIVSP7VKQREElI«RICLHAra 

TKNEIOHLLHTLEOIFLCNVSSL 

CPn_1044 1198700 1197699

•bioB-Biotin Synthase

AKHMREETVSWSLEDIRErYKTPVFELIHKANAIUlSNFUiSELCTCYLISIKro^^ 
CAYCAOSSRYHTHVTPEPMMKIVDVVERAWUVELGATRVCLGAAWRNAra 
M\«SrTOLCAEVCCALGMLSEE0AKKLYnACLYAYNHNLOSSPEFYErri ITTRSYEDRLN 
TLWnrtJKSC ISTCCOGIVCMCESEEDRIKmiVIATRDH r PESV^^ 
PPrsnreVUtriATARWFPRSMVIUJUWRArLTVEOOTLCFLACANSIFYCDXLL 
NDIDEDAQUKLLGLIPRPSFGIERCNPCYANNS 

CPn_1045 1199602 1198901

•conserved hypothetical bacterial membrane protein 

C7I^P«WrSHRKTLVFSYI^STrrLU.VI^NLVLSSKLIPTrFFNriIPCGLILYPLTFl.I 

SDVVrreiFCPiaCARVMIFSAriANU.ASSIVOrnfFFPVASPEM0TAWHaJDt3PL^ 

ASUJtf'IVSQOLDrVLYTrFKNRTPNSSl«lJlSNCSTWISOIPDrFt\^^ 

FPOTlWIMrYSyiYKITFCVLTTPLFYtAVMriRKFLCMPSTKIANTVPLTO 

CPn_1046 1200675 1199590

•Tryptophan Hydroxylase

VHYCERTUSPKYILXIALKLROSLSlJTQNSOSLORAYSTPYSyYRIlLOI^^ 
RHKCISIIXFFKNUJ^VHLLSLSKNORBSCSTOKAVVCT 

YCPRFTLDYIiAFGU^FLDHGAVlKFFELSTOrSYYPVSGFVAPHaVLSLLODKYrpr 
ASVMRTLDKtWFSLTPDLIHDIXCHVPWIXHPSFSEFFINMGRLrrK\aEICVOAL^^ 
RZQTU}SNLZAIVRCFVTrVESGLIENHECRXAYGAVLIS5PQELCHAFinnnWI^ 
DQI IRLPFNTSTPQErLFSIRHFOCLVELTSKLEWHLOQGLLESIPLYNOEKYLSCFCVL 
CQ 

CPn_1047 1200537 1201343

dapB-Dihydrodipicolinate Reductase

FQSRNHGSSMHVGVIGCSCRTCK\nn;SALEQSSErrLGPGPSRSSALTLFX7^^ 
DFSHPLLTKEWAKLLXSPKPLZ ZGTTGFPSXQCEAHOSLEELTHIVPWVCPMASLGAV 
IHXRLVHLLSOLCNPQFDXRIRErHHRYKKDSLSGTAQOLIITriOOVKQeDW 
RDSSlOCTIEVOSSin/GDIPCEHEVAFISSCEOILVRHT^RNmRCILSZLZMtXTtN 
PQPCLYSLCmXELVLRNEKCLLKKTTDH 

CPn_1048 1201588 1202604

asd-Aspartate Dehydrogenase 

OGERKCMRZAVLOmSLVGOXrVALLKKWYREWVIAEVVASKSKYGOSYGDACIW^ 

PMPa!\mDLPIRKIEEVOSDrVVSrLPSSAESMEAYCLSQGKVVFSMASTYRMHSSVPI I 

rPCVN5OHF0LI^PYPGKIZTSPNCCVSCITLALAPUUCF5lJ3HVHI\mj05A^^ 

PGVPSIXtXA»nVPHIVC£EEKIUlEmZLGSSKOPLPCXLS\rnmRVP^^ 

VTFSKDVDI^EILYSYOEKMCEmrrYOLYCNPWSPOARKHLSra 

RTZtaMVLIKKLVRCAACTU^HENYFTDYLKRETCLR 

CPn_1049 1202586 1203914

lysC-Aspartokinase III

BGNVSKZVYKFGCTSLATA£NXCLVCOZ ZCKDKPSFVWSAIAGVTDLLVDPCSSSLRER 

E£VLRKXCGKHE£IVKNIAIPFPVSTmSRLLPYtjOKLEZ50UlFARIZ*SLGEDISASLV 

RAVCSTRGWDLCFLSARSVIC.TOOSYRRASPNU)LMKAHWKOL£LN0PSYZZ0CFIGSNG 

U;ETVI^RGGSOYSATLZA£IJUUTEVniIYTOVNGIYTHDPKVISDAORIPEL5FEEM0 

NtJ^SFGAKVLYPPHLFPCMRAGIPirvrrSTFDPEKGGTW^AVDK^ 

YQ5rCSVUYTVLGCa;i£EZLCZL£SHGXDP£LMZAONNVVGrVK000ZIS0EAQEKLVD 

^.SI^SVTRLHHSVALITMZGraCSSPKWSTITEKUKJFOCPVFCrCOSSMALSFWAS 

ELAEGZ lEELHNDYVKQKAZVAT 

CPn_1050 1203884 1204798

dapA-Dihydrodipicolinate Synthase

LCKTKSYSRHVCR IHHLLTATVTPFFPNCTIDFASLERLLSFODAVCNCWLLCSTCBGL 
SLTKKEKOALIC FACDLOLKVPLFVCTSGTLLEEVLDWI HFCNOLP ISGFLKTTPIYTKP 
KLCCO I U^FEAVUJAAKHPA ILYNIPSRAATPLYLDTVKALAHHPOFLC IKDSOCSVEEF 
OSYKS I APH lOLYCCDDVFWSEMAACCAHGL ISVLSNAWPEEAREYVLNPOEODYRSLWM 
ETCRW\nrrTTNP rCIKA ILAYKKAZTHAOUlLPi^ lEOFDLEWSPAVESMUWPKUff 
VFSYS 

CPn_1051 1204956 1205270

No robust homolog present in Genebank/EMBL as of 11/7/98
FFKTPKS lOOLHL I KT I DPVRKISPVnTKKSSFFROSLLRFLELFWMFLYC IRS IRFHCV 
HIATFlCRGLILFLTTLFLSMrCILHFITLPWlCKEDPRIIRKNK 

CPn. 1 052 i;054O2 120616'* 

No robust homolog present in Genebank/EMBL as of 11/7/98

FF I0KMKYN5REK IKitAUl tCJ^YC lTVFRN>JFf;Lr;CYDKT rY::LJCYVFNCJPNS ICRCR 

::FrFFR';KKTEVETKEVK t KDETI>P:;LEr;NnfVKVAEnFPKRttAALE:;mC0:7.': tCNLCA 

trMFLDi'/jHL: :uttF:'K E iv*T:rp r FTRnK.TiK •nAGr::;EPFrcTrAa:vt ;lrjkiacsyel 
*:vtacllo':rlkov::o: ;i irtrat.';:; r l::vi k :::mvtr Ptiirrrry t vi :Ki\ft Pt*f .fffrltso 

VRRDl.KKKFRLKMrKP 

No robust homolog present in Genebank/EMBL as of 11/7/98

KK:;::fMJiKAKCMiMiLYLTtn*:i4/;vA*:i'iurru:uw::KKA:;Mtvi.w::KKi«:[a^^ 

.■•.r;KDYAFFr.i.TARK::mi ricaci -NMTi-wnv I .jMtri : V -prrncTNi JCF^vniti j:iiwr:p 
tLr,VK.';AF 

Pn.lOS'l 1207010 I20')4f>h 

No robust homolog present in Genebank/EMBL as of 11/7/98
:;RWimfl F tMQVU^POLPP PP0H5VC3 l53PSKIJlVLAITrLVFCMIXLI3CALn^ 
pr,u->AA I SFCLC :CLSAXj3GVLMI3CLLCU.VKRE I PTVRPEEI PECVSLXPSEEPALQA 
•^OKTl^OLPKE^^LCrrDrOE\^ACI^KLKDSKVESRSFL^mAKKELR\^^V^^ 
^^aJlO:VACBC>roLW^Lr^CGR3IJWAESESLDLFKVSKRLCVLPSCOVRCBCaJ^ 
KrrVAPr/irtrj^CEtKKVAVAFORN.TYAMAEKAFAKALnALEESVyRSt.TOSYRDKFLESE 

yVRI>fy[X3EF0KAG£IU.EJCLHALVPEVSVSIRENKIQE:rBSNt£KAyEAIEENyRCCV^ 

0EDV>KEEEKREAEFRERGNKIl^PEEI£SSLE0^OHCUC^^fSEKU^aXCHI^^ 

TAEVOiKri^DAESRlXmEBVKEMPCRIEEIE3CnJlKAELPaPTKKArEKACSQW 

C^EMIXmPYCKE SLAYV TSKERLVSLDEDLRRAYTECOKRFQGDSCLESEVRACREQL 

RERIOEFETOCXOLVEKELLCVSSWJWTBCDCVSCWCEAPPGKKrfAO^ 

OSRWMTKSERIJIECVOACNKMIJCACLSEEDKVUCEE^^ 

O0RVAAF£SI£VPEIPEAPE£KF5LLOKAASLFTR£OHT 

CPn_l055 1209583 1210521 

No robust homolog present in Genebank/EMBL as of 11/7/98

Cm.YHHSYPPPPtWSVCyUTCLSKF?VtAlTFLVU;VUXISGALrLTI^ 

FCLCIGLSALCCVLWSCIXCUJUaiE\aWlPEEIPHGVSVAPSEEPALC5AT0I^^ 

PKEtlttLDRYIOEVVSCUIKlJCDUlCEDOCUJCCAKEKLOVrDF 

OOEianrUCCLIOEMRDXGSTLFMSQVSLnCLMEWLCrfLPSCTTOGERI^ 

RRICOTRKVAMTFDRNAYGVAJerArEKArCALErcVYKSMrESYREAFCEVXTO 

EKILRICYLELRR 

CPn_1056 1210482 1211228

No robust homolog present in Genebank/EMBL as of 11/7/98

CEDIKJMIJRVEEIEMtUlVrELPLLPIKOALEKAFVQWSyXAKLTKVEPCFRESPAVI 

TSEERUJSIICrLERAYKEYOKRFOEPSRIXSEVSGCREHIJlZQVKOFCTQCL^ 

IFVSDVLFRKMVSCLVSTVHVPFMEFYYEVTmnUJttJUkOWMAJ^ 

KETLEKAKAPREEEVVrtlX:EERKSKEKRLIL^«CIEAAQORVKDLEPPPIKETC^ 

YSFTIRLKS 

CPn_1057 1211467 1213596 

CT356 hypothetical protein

rIHFYFF^^FAMPEPLY^^aarTacSPVlIXYAHTPV^««TWCAEAFHIAAIEMCP^^^ 

GCKHSRVCQVMLOESYTKPEIAAMLNEYrS/WKVTOKEELPYVAKLYCDL^^ 

ETVSWPLNVFLTPDLVPrrS\WLCNBCKlxa^SFPOriOiaJiF>^^ 

KVIXIASFLBCCVRKEIUDESSIJanVAALYODIDPHYGGVKAFTKRUCL^ 

LEYOESRCLFFVDRSLSMVALCGVimHIGCC\r^SYTIDDKWLIPAFEKRtII»«^^ 

LEAKACLGKEEYRCICKQII^ILSELYSPEVGAr/SSEOAaWSAGOONFYT^^ 

NAIfin>AEirCDYYGISRECFFNCRNIUlXPVHREI£12^aCYHRSIEAIEDIVDRSW3I 

CJCCIRAORSHRSICnDr^TFl^^KagMlYTFAYAGRLWEVEYIEIGKKCCEFVRNSLY 

ELYRRWRBCEAKYRASLEZ>YC»I.ILGVLALYESGCGSFWLSFAmH}EVVLSFRS£^ 

FYSVTORDSTLLlKOSPI^aiETlSGflALICQCLLSIJtLrrEiaCHYLTYAEDILOI^^ 

AJmOa^SSLCLLIASOOTFSRKHVKVLIALCDQnjRSPVUaXSGIJlJ^^ 

QEHLnVLP£Y£Ka*IPKCrcTATTIYVI£VDQCKRrKI)LELFRRYl.rSL 

CPn_1058 1213742 1214836

CT355 hypothetical protein

EVWXYQTUffilVtVSTGCIFUMiGCYAAEVPVTSSGYENI^^ 
FKVDEENVVTAIJJVrHKUaiJTfNSYPHLrDSFPARSOYYTAhWPVVI^^ 
AKAIOlIATDPTAVMOEIEEHFGRDI^PLYAHrHMSPNDIFNVIORTLTAQRV^^ 
VMLKVTPGKIREYTOKLEEEASRKVIWKYRVLTlKANr^^ 

KDRLTALVISOOCQLVCSEErSRQJSELSOSHROELDLIGYPKELCCLPKAHKSCYXLYM 

UJ>CTSGSIEPU>VM£SKIKOHI^AL£AESVEKQYKDRLRiCRYGYDASMIAm^E£J^ 

VFSLL 

CPn_1059 1214848 1215678

kgsA-Dimethyladenosine Transferase

VTRSSPAOLSRFLSEIONKpKKSLSONFLVDONIVKKIVATSEVrPOOWVLEIGPCrGAL 
TEELIAAGAOVIAIEKDPMFAPSLEELPIRLEIIDACKYPLDOLOEYKTLCKGRWANLP 
YHmpLI.TKLFLEAPDFWKTVTVMVQDEVARR rVAOPGCROYCSLTIFLOFFAOIHYAF 
KVSASCFYPKPOVOSA\^KMCVKETLPI^DEEIPVFrrLTRTAfQORRJCVIJWrtJC^^ 
KE0VEOAIJC£LGUi^A^^PEV^^LNDYLALFHKMOAC 

CPn_1060 1217694 1215727

dxs/tkt-Transketolase

YKRFLYIHITKVKTSSSCPI^Lr LSPADLKKLS ISOLPCLAEEIRYRI ISVLSOTCGHL 
SSNI^I\^TIAU<YVFSSPK0KFrFDVGH{7rrPHKt^TGRNNBCFDHIRhroN^ 
PrESDHDLFFSGKACTALSlJVLGMAOTTPLESRTHVIPrLGDAAFSCGLTLEAlJWISTD 
LSKFWrL^^»WMSISK>^^GA«SRIFSRWU^HPATNKLTKOVEKWLAKIPRYGDSL^ 

OlI)0AO^WPAKyHC^^tANFNKRESAKHLPAIKPKPSFPDIFC0TLCELCEVSSRUfA^•P 
AMS IGSRLECFKQKFPERFFOVG I AECHAVTFSAG I AKACNPVrcS I YSTFLHRALONVF 
HDVCM0DI.PVIFAIDRACLAYGDGRSHHGIYDMSFLRAMPOMIICOPRSOWF0OU.YSS 
LHWSSP3AIRYPNIPAPHCOPLTGDPNFLRSPGMAETLS0GEDVLIIALCTLCFTALSIK 

FK\raiLNFAIPCrrFl^HGSKEALTKSIGLDESSKnmaTHFNFRSKK(yrrci>VRV 

CPn_1061 1217932 1217666

CT330 hypothetical protein 

FGSU^VEIHHKDPSUCKLFALOOSLETlJJSr^DrVATYEAMFSLIYBCUWVLRKOOLCY 
LLSVNSKGELUGPSCOPIVQTFPrHPHH 

•JPn,10rt2 1219835 IJ1915'> 

xseA-Exodeoxyribonuclease VII

R(;FPVM:;.';PrOAVAStTER IKTLLESNFCOI IVKCELUNVSL0Pj;CHLYFCIKOS0AFtW 
*:AFF1IFr.;;Kr.XriKPKIX:0AVIIl((lKLAWAPRG0Y0IVAIIALVYAGEC0Ll>'jKFEETKR 
la.TAEGYFATEKKKPLPFAPOC tCV I TIJPTGAV IQD I LRVLGRRARNYK t LV^PVTVOCN 
::/\AllEr..KAtEVI4hlAENLADVLrTARGCCJlEDLWAFNEErLVKA£HA;rrCPrVSAV;HE 



tpiS-Triosephosphate Isomerase

FCRE3MR EKFRENKERKMTPC. - UJNMKKHKTrCEAKEWCTLAJLLwGEPI^TCTrClA 

SPFT3LRAIHEVIt#rpr)*PTJ<LCTOr|(HI^K^CAniXtj"^ 

H trCESDAF r ASKVKSVAOACLVPVWVGKLEVRrat^ 

FL r AYEP^Atf ArCTCKVAEASDVOD tHMFCREWAERFSEATAEEI J aYCGSVKVDNAQR 
FC0C3DVDCU.VCGASL£C0SFFEVAKNrNV 



12207 1220<»'»5 



CPn_1065 1221140 1220928

No robust homolog present in Genebank/EMBL as of 11/7/98

RHRIX»HRRTSOPCFLFYF3IPEESLPPOSCRUJ0KPKHEHLPS ILUCKP I IDYUtrrSI 
YEXAIFKrCLP 

CPn_1066 1221132 1221468

No robust homolog present in Genebank/EMBL as of 11/7/98

DirjqCVTSVCAVArCICCIXI^rSTWUjGKKLDAKEftXPA AFP^ ^ 

CPn_1067 1221675 1222292

def-Polypeptide Deformylase

rQVLVVRDFFTELC0AHVC7n<IRRlXYYCSPrLRKKSSPIAElT0ErRNLVSDMC&^ 
HRCVCLAAPCWSKNVSUT/MCVDRETEDCELIFSESPRVFINPVLSDPSETPI ICKBEXL 
SIPGUUSEVFRPOKrrVTAMDLNCKIFTEHLECFTARIlHHETDHUC'/LYIDLKEEPKO 
PKXFKASLEKIKRRY^m{LSK££LVS 

CPn_1068 1223267 1222365 

rnhB-Ribonuclease HII

MSCMPPPFvvTL^TSAON^^JWLKEK^ffrFsoPO^^vTOARS^m^^c^ 

KCSEErrEFTl£PEIUrrrrHARVEODLRPRtX;VDESCKCDFFGPLCIAAVYASKAEILK 
KLYEimVODSWilJCITncIASLARIIRSLC^nCDWILYPEKyNELYCK^^ 
TVIfffEtAPKPACD\^AISIX}FAASEVTIXKALOKKEn)mrQKPRABODV^^ 
ROAFVQSIQKIXBQYOVOLPKCMGFWKAACREIAKORCKELLAKISI^^ 

CPn_1069 1223507 1223941 

yfgA-HTH Transcriptional Regulator

VIMOEHlHKEUJlL3EIFRSSRESOSLSIiCDVEAATSIRYSCt£AIE0(XU:KLrSPW 
QGFIIQCYATYUaJGDSIl^BfPYVMKIFTCErSDHMi mr.Tn 

NLUWAGLI Z ZCSGIKVMWLGSLTSZF 

CPn_1070 1225523 1224144

No robust homolog present in Genebank/EMBL as of 11/7/98

RRSI>frrPCGNCNCTflfReiTPPNP0GEDZPLQB30QSCSQQSRVITQQ P CTO Sft £>CXSL 

GSlKVUJWIOACSLLNNIXDSAWCRLQfYCYRTCT 

ErW»PrwPSAOFtXX3U0QYCPXCVGKSPQ0U»HCT0KIE0CEPLCrxa)KO 

HREtJLKAAOPRCK;ESLVKLLON^CLGEDM0(OTPWSULOAVSEa^ 

VrtLOPBQOPCPPPPTDEBOlJOGAVOGAPAPaOKKHPAOECRVTCKLNFRTIiQia^^ 

LSLESGYKCPLCXJAAKOrVDLXiaCSIJCRLVASDLATrWPCXCLSl^JVI^ 

SKGYU>U)PLHPEQTVLDPRVOGPWRXLRJCVLVTrTAGENIWRQTa:^^ 

WDDDEZERTCIVTOOGFXJIPCOCZJOWKLPTEKRPNRWL 

CPn_1071 1227336 1225885 

No robust homolog present in Genebank/EMBL as of 11/7/98
KgrowCPNWSWFRMCGNFNCEWVE V 1 1 ' l LL ' l * rR QSAgargFPA/7g gij r;j^ x I ' iUHlM 

TKVEKRVQFNTAOCDESTrHMIOEACELVOSZl^HRRTQCCTEYCynSYATCXGORaK 

CRLICCTYKACCUJREOCVAGLVHECBOTTOPXAVAIJUICl^ 

OKNErR0HCSEAKTQLYCTM0Si:OT*rFLBCVNSXRERGLDDSLVOAVLSFXATIlSWEKT 

ZESEEASCTSSASNSTRZPACYILOTSPLTrSRLSCCSWJARRPSSVCAEPOYVAiaCYND 

NGMAR0LCKX0\rmiJCTCDrSALCPFClI.ZVKKLNSFU^0SrSSILIOf^ 

PNFRDIVVLI2^LAICYCPA^m)ETSVVDIHMZDDPr^f^IFYRWYSYRTCKTSASrLIaCK 

PSLVROESUX:PTPAESWUlSSlXEEDDCTDDEDG*aAYQ0RIt^SCHLffrLnjCIK 
INKE 

CPn_1072 1227924 1228835 

No robust homolog present in Genebank/EMBL as of 11/7/98

KKDVrLHANV^CWKOMLKIOKKRhfCVSWrrVCAIVCFFNSAOAAPKKKKI P lOI^ 

KVSSYUCNEDASTIFCVrnTORGLLOHRYLGSPGMOETRRROLFKSLEIJOSVGNERl^^ 

LAIOIFRNKECLESErPBOMEAILANSSALVLCISSrCITCIPATLKSLLRONUSFOKRS 

lASESFLUCIDSAPSDASVFYXCVLFRCETAIVDALSOLFAOLDLSPICKIZrLCEDPEW 

OjWCSACIGVOWFLGLVYYPAOESIJ'SYVHPYSTATELOEAOGLQVISDEVAOmW^ 

CPn_1073 1229011 1229832

Predicted OMP (CT371)

MRRYLFMVLALCLYRAAPLEAWI K ITDAOAVLKFAREKTLVCFNI EirrwrPKQHVCOS 
AWLYNRELDUOTLSEEOAREOAFLEWMCISFLVDYELVCANLRN'/LTCLSLKRSWVLCI 
SOR PVHL IKNTLRILR5FN I DFTSCPA ICEO^Jl^HPTKOrrFOOAMAI EKNI U^SUC 
NCOPHDAALEVLLSCISSPPSOIIYVDODAERUlSIGAFCKKANIYFrCMLYTPAKORVE 
CYNPKLTAIOWSQIRKNLSDEYYESLLSYVKSK 



RNA SECTION 



WRWUrTLICRRLMYOKE 



^:y^lKIm'^KlIAl(^^vLEcvu^JIlvoKf.ELu:RRu:r^^;^ElJ^LONfJKrAv,\NVKETLAT^L 



u :ea I Lr/TN t R £':kl t K( ; 



ninWiA n»M'M I tM074 

RilifMiit. i. ■.(::.• I- UNA i.llV I IJ 

I*..; rKMA io(i:;n.: 

.: t:; iHna ina.Mis iciir..:7» 

t«NA ((»(»•. ri: Kmi'.mi * 
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51 

tRNAs

tRNA  Begin        End      Type  Codon
1     89657        89728    Thr   CGT
2     [illegible]  91070    Trp   CCA
3     236075       236147   Val   TAC
6     296151       296224   Asp   CTC
7     409848       409922   Pro   TOG
8     462141       462214   Arg   CCP
9     672236       672318   Leu   CAA
10    677264       677337   Arg   TCC
11    739403       739486   Leu   CAS
12    781610       781680   Gly   TCC
13    784822       784896   Glu   TTC
14    784922       784994   Lys   TTf
15    836119       836191   Ala   OOC
16    843926       843999   Pro   OGC
17    877400       877473   Arg   ACC
18    1085605      1085676  Gln   TTC
19    1142034      1142118  Ser   TGA
20    1175863      1175944  Leu   TAG
21    1230028      1229942  Ser   CGA
22    1137462      1137389  Val   GAC
23    1030603      1030533  Cys   GCA
24    1000022      999949   His   CTC
25    961607       961536   Gly   GCC
26    807413       807341   Arg   Tcr
27    786780       786708   Thr   CCT
28    715971       715889   Leu   TAA
29    708441       708354   Ser   OCT
30    680259       680178   Leu   GAS
31    631445       631373   Phe   GAA
32    626987       626901   Ser   GGA
33    293477       293405   Thr   TGT
34    293399       293317   Tyr   GTA
35    269142       269070   Ala   TCC
36    269065       268992   Ile   GAT
37    164389       164318   Asn   GTT
38    87523        87450    Met   CAT
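
The Begin and End coordinates listed for each tRNA can be turned into a strand and a length with a few lines of code. The following is a minimal sketch only, and it rests on two assumptions that the listing itself does not state: that an entry whose Begin is greater than its End lies on the complementary strand, and that both coordinates are 1-based and inclusive. The names `TrnaRow`, `parse_row`, `strand` and `length` are hypothetical helpers introduced for illustration.

```python
from typing import NamedTuple


class TrnaRow(NamedTuple):
    number: int
    begin: int
    end: int
    amino_acid: str
    codon: str


def parse_row(line: str) -> TrnaRow:
    """Parse one whitespace-separated row: number, begin, end, type, codon."""
    number, begin, end, amino_acid, codon = line.split()
    return TrnaRow(int(number), int(begin), int(end), amino_acid, codon)


def strand(row: TrnaRow) -> str:
    # Assumption: Begin > End means the gene lies on the complementary strand.
    return "-" if row.begin > row.end else "+"


def length(row: TrnaRow) -> int:
    # Assumption: coordinates are 1-based and inclusive at both ends.
    return abs(row.end - row.begin) + 1


# Two rows copied from the tRNA listing (entries 1 and 37).
rows = [parse_row(s) for s in (
    "1 89657 89728 Thr CGT",
    "37 164389 164318 Asn GTT",
)]

for r in rows:
    print(r.number, r.amino_acid, strand(r), length(r))
```

Under these assumptions both example entries come out as 72 nucleotides long, one on each strand, which is in the usual size range for a tRNA gene.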
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What is Claimed is: 

1. An isolated nucleic acid encoding a C. pneumoniae protein as set
forth in Table 3.

2. The isolated nucleic acid of Claim 1, wherein said nucleic acid has 
a nucleotide sequence of an open reading frame in SEQ ID NO: 1. 

3. A probe comprising a hybridizing fragment of an isolated nucleic
acid according to Claim 2.

5. An isolated nucleic acid that hybridizes under stringent conditions
to the nucleic acid sequence of Claim 2.

6. An expression cassette comprising a transcriptional initiation
region functional in an expression host, a nucleic acid having a sequence of the isolated
nucleic acid according to Claim 1 under the transcriptional regulation of said
transcriptional initiation region, and a transcriptional termination region functional in said
expression host.

7. A cell comprising an expression cassette according to Claim 6 as 
part of an extrachromosomal element or integrated into the genome of a host cell as a 
result of introduction of said expression cassette into said host cell, and the cellular 
progeny of said host cell. 

8. A method for producing a C. pneumoniae protein, said method
comprising:

growing a cell according to Claim 7, whereby said C. pneumoniae protein
is expressed; and

isolating said C. pneumoniae protein free of other proteins.
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9. A purified polypeptide composition comprising at least 50 weight 
% of the protein present as a C. pneumoniae protein comprising an amino acid sequence
of Claim 1.

10. A monoclonal antibody binding specifically to the polypeptide of
Claim 9.
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Contig463 
Length: 273254

1 ATTGTTCCTG TAAGAACACT TCCAAAGCGC ATTTAATCAT TTTTAGTAAA 

51 AAATAAAAAT ATACTTTTAA ATGTTGAGAA AATTTTTAGC TAAACTTTAT 

101 AAAGGGTTGT TGGTGAAACC TTTGGGTTAC TCCTCAGAAC GACTTTGTGA 

151 TTCTATAGTA TTAAAAGGAT CTTGGAGTAT AACAAGTAAA GATCTTTGAG 

201 GATAGCGTAG GGCCGTATTT TGAATAGCGT CCAATAAAGC GCGTTTGCAA 

251 AACGCTTGAG TTTGGTTGTC CCAATAGAAA GTGCCTTCTT TAGGAAGAAT 

301 CTCTTCTGGA GGCACTTCAT AGACCGAAGT AAAGAGAGGA AGAGCAACGA
351 TTGCTGCATG ACTTTCTATA GCTGCTTTAA GGCAGTTCTC GTACGCTAGT 

401 AAAGCTTGGC GATAATATTC TTGCTGATTT GGTAACTCTT CAGATTTAGG
451 GCCGCATACG TGGCCTAAAA AGGTCGGAAG AATACTTTTC TTTTCTGCAG
501 AGCTTAAATT TAGATTAAAC GTTTGATCTA GAGCTTCGTT TGGAAGTTTT 
551 ACTACTCTCA CTTCGGTAGG GGAAAAGGGG TCTTCTCCTT TTGCGGGACC 
601 CCCTTCGCGT TGCTTGCATG TATCCCACAC GCTTTTATCT TTTAGGGTGG 
651 AGTAAAGGAT AGTAGAGAGG TTCGTTGCAG TGTTGTCGAT CAGATTCGTT 
701 GGGCCTACGG GATTAAAGAT GATCCCTGTG GATTGATTTT TTTCGATCAC 
751 TCTAAGTCCA GTTAAGAAAG TAGGCTGAAA TGGTTGAGAC GCATCTGTTT 
801 GTATCGCTAC CTTGAACTTA GGGTTCAGGT GATTATTGTA AAATTGCATC 
851 TCGTTTGAGT AGCAGTCTAC GTTTTTTTCT TGCCACGCTT TTCCCAAAGG 
901 CTTGAAGTTT TGCTCTAGAA CTTTCTGCCA GTTAGAAGAT ACCTTTGAGG 
951 TCATTTGGTG GTAGACTAAG AAGGTTACAA CTGAGAAGAG GGCCGTGGTA 

1001 ATGAGAAGAG CCAAAAATAC AGGGTTCCCT AATACTATCG TTAAAGAGAT 

1051 TCCAGCCACC AAAGCTCCTA AAGCTAAAGA AGCTAGGATT GCAAGAGTGG

1101 ATATTTTTGC TATGGTAAAC TGTTTTTTAG GAGCAATTTC TTTATCCCGA 

1151 GGCACATAGG ATAGTACAGA AACTTGAGAG CTCTCAGTAC GTGAGGGTCC 

1201 TGACATAACA TTTTTTTTGT AAAATACTTT CTATAATTTT AACATATTTG 

1251 TGTTTATCGA TCCGAGAAAA TTGGAGAGTG AGAGCGCATG TCTTGCAATT 
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1301 TAGAATGATC GGGGACGACA TCTAGAGCTA TGTAGACATT GCGTGCGTAG
1351 TGGGAGCAAA TATAGCGAGA TATAAAGTAT AAGGGAATTG CTGTTAGGAA
1401 GATAAAGGAG CACAAAGGGT GGATACATAG CCCAATAGCT ATGGTGGTAG
1451 CAATCAGAGC TATCCAGACG AGTGCAATCG CAATAGTAAC GAAGAGGGCA
1501 AGCTTGAAAT TATAGCGAGG ACGAGTAGCT GGGGGAAATA GAGAGGGAGC
1551 CGTTCCATCA AAACCGGGAG TAGCTGAAGA AGCCATAAAC TATTAAAAAT
1601 TAAGTTTTTT TCGGAGCATA AAGCATTTTA AAGTAGTGGG GTCTTTTTTG
1651 TCACGGAGAT GTCCTGGACT TCCCAAGCGT TTCTAACAAA GATACCTGCT
1701 TTTGAGAGGA GAACTTTTGA AACTCCTGCA AGGTCATCCT TCCTTGGCAC
1751 CAGTAGGTTT TTTCAGGAAA TCGCGGAAAG ATTTTGGCGA AAGCTCTTAC
1801 AGTTGAAGGG CTTGTGAAGA TAATTTTTTT GTATTTAGAT AAAATATTTT
1851 TTTTAAGTTT TCGCGGCTTC ACTGTGTAGT GAGGGTAAGA GAAAAAAGTA
1901 AATCGATTGT AAAGAAATTC TCTGATCACA GGTCTTGCGA GGGAGGAGTC
1951 GGGGTAGAGA ATGCGGGCTG AAGAGGGCAG TGCCTGTAGC AATGGGAAGA
2001 TGCCTTCAGC GATTTCTTGA GTTGCTACTA CGTACTTCAC TTGTCCAAGG
2051 AAAGAGAGAA GTCTTTCTTT GGTGGACTCT CCTATACAGA GGTAGGTCTT
2101 TGTTTTTAGA GTGGCCTTAG AAAGAAGAGA AGTCATTCTG GAAAGGAATA
2151 GGTGAGTGGA TGAGGGACTT GTGAGAATCA CATGGGTTGC TTGTGGAAGG
2201 AATTGAAGAG CACGCTTATT TTGTGGAGTG CTTTTTGCAT AGGGGAAGAG
2251 AGTTAGAATA GGCAAATAAT GAGCTTGGTA TTTACGAGCG GTTTTTTGAT
2301 TCAATCCTAA GTAGAGGGTC ATGAGTCTTT TCGGGTAAAA GGAAGGCTGC
2351 CTAAGTTTTT GTACCTTCAA AGGGATATAT TGAAAATAAT TTTTCTTTTT
2401 CCCTTGGTTC TTCTTGATCA TGCGTTGATT GACATTTTTC ACTTTGAAGG
2451 CTAGGCTGGT TTTTTCTGGA CTTAGAGGTT CTCTCTATTA AGGCTTCGTC
2501 TTTAGAAGTC CTTCCTAAAA GTTTTTGAGA AATTTAAGAA ATTCGCAATA
2551 GTGGAAATAT TTACAAAGGT GGTTGCGGTG GTTTCGTTGT TGCATAAGTT
2601 TTTAGAAAAT GCTTCGGGGA AAAAGGGACA AAGTTTAGCT TCGACAGCGT
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2651 ATTTAGCAGC TCTTGACCAT CTCTTAAATG CGTTTCCTTC CATTGGGGAG
2701 AGAATCATTG ATGAGTTGAA GAGCCAGCGT TCCCATTTAA AGATGATTGC
2751 TTCTGAAAAC TATTCTTCAC TTTCAGTGCA GTTGGCTATG GGGAACTTGC
2801 TCACAGATAA GTATTGTGAA GGAAGTCCCT TTAAGCGTTT CTATTCCTGT
2851 TGTGAAAATG TAGATGCTAT TGAGTGGGAG TGTGTAGAGA CAGCGAAAGA
2901 ACTTTTTGCT GCGGATTGCG CTTGTGTTCA GCCTCATTCT GGGGCTGATG
2951 CTAATTTACT GGCAGTAATG GCCATTCTCA CGCACAAAGT CCAAGGCCCA
3001 GCTGTCAGTA AGTTAGGTTA TAAAACTGTA AACGAATTAA CAGAAGAAGA
3051 ATACACTCTA CTTAAGGCTG AAATGTCTTC TTGTGTTTGC TTAGGACCTT
3101 CATTAAATTC TGGAGGCCAT TTGACCCATG GGAACGTACG TTTAAATGTG
3151 ATGTCTAAGC TTATGCGTTG CTTCCCCTAT GATGTCAATC CGGATACGGA
3201 GTGTTTTGAT TATGCAGAGA TCTCCCGGTT AGCTAAGGAG TATAAACCTA
3251 AGGTACTGAT CGCAGGATAT TCTTCCTATT CTCGAAGATT AAACTTTGCA
3301 GTTTTAAAAC AGATTGCAGA GGATTGTGGA TCTGTCTTGT GGGTAGATAT
3351 GGCGCATTTT GCAGGCCTAG TTGCTGGGGG AGTGTTTGTT GATGAAGAAA
3401 ATCCTATTCC TTATGCAGAT ATAGTGACAA CAACAACGCA TAAGACATTA
3451 CGCGGTCCTC GCGGGGGATT AGTTTTGGCA ACTCGAGAGT ATGAAAGCAC
3501 TCTCAATAAG GCGTGTCCTT TGATGATGGG AGGTCCTCTA CCTCACGTGA
3551 TAGCTGCTAA AACAGTGGCT TTGAAGGAAG CTCTCTCTGT GGATTTCAAG
3601 AAATACGCTC ATCAGGTTGT AAATAATGCT CGTCGATTAG CAGAGAGATT
3651 TTTAAGTCAT GGGCTACGTC TTTTGACGGG AGGAACAGAC AACCACATGA
3701 TGGTGATTGA TTTAGGTTCT TTGGGCATTT CTGGAAAAAT TGCTGAAGAT
3751 ATCTTGAGTT CCGTAGGAAT TGCTGTGAAT CGGAATTCAT TACCTTCAGA
3801 TGCTATTGGT AAGTGGGACA CTTCAGGTAT ACGTTTAGGA ACCCCTGCAC
3851 TAACGACTTT GGGTATGGGT ATCGATGAAA TGGAAGAAGT TGCAGATATT
3901 ATTGTGAAAG TATTGCGAAA TATTCGTTTA AGTTGCCATG TTGAAGGGAG
3951 TTCTAAGAAA AATAAAGGGG AACTTCCTGA AGCCATAGCG CAGGAAGCTA
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4001 GAGATCGTGT TCGCAACTTG TTGCTGCGTT TCCCGCTCTA CCCTGAAATT
4051 GATTTAGAAG CTTTAGTTTA GTTAGGAGAG ACATTATTTT ATGGCAGACG
4101 GGGAAGTTCA TAAATTACGT GATATTATAG AAAAAGAGTT ATTGGAAGCG
4151 CGCAGAGTAT TTTTCTCAGA GCCTGTAACA GAGAAAAGTG CTTCCGATGC
4201 AATTAAAAAG CTTTGGTATT TGGAATTAAA AGATCCTGGA AAGCCTATAG
4251 TTTTTGTGAT CAATAGTCCT GGGGGATCTG TGGACGCAGG TTTTGCTGTT
4301 TGGGATCAAA TTAAAATGTT AACCTCACCC GTCACTACTG TTGTGACAGG
4351 GTTGGCAGCT TCTATGGGCT CGGTATTGAG TTTATGTGCA GCTCCTGGAA
4401 GGAGATTTGC AACTCCTCAT TCTAGAATTA TGATTCATCA ACCTTCAATA
4451 GGTGGACCGA TTACCGGTCA GGCAACCGAT TTAGACATTC ATGCGAGAGA
4501 GATTTTAAAA ACAAAAGCTC GCATTATAGA TGTCTATGTA GAGGCGACAA
4551 ATCA.2^CCTCG AGATATCATA GAAAAGGCTA TCGATAGAGA TATGTGGATG
4601 ACAGCCAACG AAGCTAAGGA TTTTGGTTTA TTGGATGGCA TTTTATTCTC
4651 CTTCAACGAT CTCTAAATAT TTTATCTATT CTGGAGCAGG AAATCGTTTC
4701 CTTCTTGGTG AAACACTTCC TGAGGTTGAA GATGTTCGGT TCTTATGCCA
4751 AGAGACGAGG GTTGATGGTT TTTTATATTT AAAGCCCTCT TCTTGTGCTG
4801 ATGCGCAACT CATTATTTTT AATTCCGATG GATCACGTCC AACGATGTGT
4851 GGTAACGGCT TGCGTTGTGC GATTGCTCAC TTAGCTTCTC AGAAGGGAAA
4901 ATCGGACATC TCTGTATCTA CGGATAGTGG TCTATATTCA GGATATTTTT
4951 ATTCTTGGGA TCGTGTGCTT GTAGATATGA CTCTCGCAGA TTGGAGAGCT
5001 TCTGTTCATC GATTGGAGTC GCGTCCTGAT CCTCTTCCCA AAGAGGTCGT
5051 TTGTATCCAT ACGGGAGTGC CTCATGCTGT CGTAATTCTT CCTGAGATTT
5101 CTACTTTAGA TCTTTCTATC TTAGGTCCTT TTCTTCGCTA TCATCAGACC
5151 TTCTCTCCAG ATGGGGTGAA TGTCAATTTT GTTCAGATAC TGGGACATTG
5201 CCAGTTGCGC GTTCGTACTT ACGAACGTGG AGTCGAAGGG GAAACTGCAG
5251 CTTGTGGAAC AGGGGCTCTA GCTTCTGCTC TTGTTGTGTC AAACTCCTAT
5301 GGATGGAAGG AGTCGATCCA AATCCATACT TGGGGTGGAG AGCTTATGAC
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5351 TGTGAGTCAA AATAGGGGAC GGGTATATCT TCAGGGCTCT GTAACTAGAG

5401 ATTTATAATT AGATGTGATT TTTGATTTTG TCATGCAAGG ATTTTAAAAT
5451 CTTGTTTAGG GATAGATCTT GCTCTCTAAC TGGGATTTTT CTATAATCGT
5501 AATTTATGAT GACGTATCCT GTACCACAAA ACCCACTTCT TTTAAGAATC 
5551 CTTCGTCTTA TGGATGCATT CTCTAAGTCT GACGATGAGA GGGACTTTTA 
5601 TTTAGATCGT GTTGAAGGGT TTATTCTCTA CATAGATTTA GATAAAGACC 
5651 AAGAGGATCT AAATAAGATT TACCAAGAAT TAGAAGAGAA TGCCGAGCGG 
5701 TATTGTTTGA TTCCGAAGTT GACGTTTTAT GAAGTAAAAA AAATCATGGA
5751 AACGTTTATC AATGAAAAGA TTTATGATAT CGATACCAAA GAAAAGTTCC
5801 TTGAGATTTT GCAATCCAAG AATGCCCGTG AGCAGTTTTT AGAGTTTATT 
5851 TATGATCACG AGGCAGAGTT AGAAAAGTGG CAGCAATTTT ATGTAGAGCG 
5901 TTCTCGAATT CGAATTATAG AATGGCTTCG CAATAATAAG TTCCATTTTG 
5951 TCTTTGAAGA AGATCTAGAT TTCACAAAGA ATGTTTTGGA ACAGTTGAAA 
6001 ATACATTTGT TTGATGCCAA GGTGGGGAAA GAAATCACTC AAGCGCGTCA 
6051 GTTGTTGTCG AACAAAGCTA AGATTTACTA TTCCAATGAA GCATTAAACC 
6101 CTCGTCCGAA ACGAGGCCGT CCTCCGAAGC AATCTGCTAA GGTAGAJ^CA 
6151 GAAACAACAA TTTCGAGTGA TATTTATACA AAAGTCCCTC AGGCTGCTCG 
6201 TCGTTTCCTT TTCTTACCCG AGATTACTTC ACCCTCTTCA ATTACTTTCT 

6251 CAGAAAAATT TGATACGGAA GAAGAATTTC TTGCTAACTT GCGCGGTTCG
6301 ACTCGTGTTG AAGACCAGCT GAATCTTACC AATCTTTCAG AGAGGTTTGC 

6351 TTCTCTTAAA GAGCTTTCGG CTAAGCTTGG TTACGACTCT CTTTCTACTG

6401 GAGATTTCTT TGGTGATGAT GATGAGAAAG TGGTCACTAA GACGAAGGGG
6451 AGCAAGCGAG GCCGCAAAAA ATCTTCTTAA TCTTCTATTT TGTGAAGTAG
6501 TTTATTTTTA GACGCTGTTC TTATTGCTTC TTTACATGAT CTTATTACAA 
6551 ATCTTTCTTA TTTCTATTTA TTGTTTTGTT AAAATTTTAA CAATAGCTAT 
6601 TTATTATTAG TCATTTTTTT AATTAAAAAA CTGTTAAAAT TTTTAAAGCT 
6651 AATTTAAGAA ACAGTGAATA GTTCATCATG TCATCACTAC TGAGCTGCGG 
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6701 AAGAATAGAG CCGACTCGGG TTACCTGTAG CTTAAAGACG TATCTTGAGG

6751 ATACGAGTCA GAATCAGTTG AGCACACGTC TAGTTCGGGC AAGTGTCATC

6801 TTTTTATGCG CATTGTTGAT CATTTTGGTT TGTGTGGCCC TCTCTAGTTT 

6851 GATTCCAAGC ATTATGGCCT TGGCGACCTC TTTTACGGTA ATGGGGTTAA 

6901 TTCTTTTTGT GATGTCACTT CTTGGTGACG TTGCAATTAT AAGTTATCTT 

6951 ACTTATAGCA CTGTTACGAG TTACCGGCAA AATAAGAGAG CTTTTGAGAT 

7001 TCACAAGCCC GCTCGCTCCG TTTACTACGA GGGGGTCCGC CATTGGGATT

7051 TAGGACGATC ATCTTTAGGC ACAGGCGAGA TTCCTATAGT AAGGACGTTA

7101 TTCTCTCCAT TTCAGAACCA TGGTCTTAAC CATGCCTTAG CTGCTAAAAT 

7151 TTTCCTATTT ATGGAGCATT TCAGCCCTGA GCCACCGAAC GAGCCTTTGG 

7201 TGGATTGGGC CTGTTTGATT CGGGATTTTA GGCCTCACGT CAGTTCTTTG

7251 TGCTTTGTTA TTGAAAAACA AGGGTCATCG CTGAGGACTA AGGAAGGCAA

7301 TACGATTTGT GAGGCTTTCC GCTCTGATTA CGACGCCCAT TTTGCTATGG
7351 TAGATTGCTA CCGGTTGATC CACTCTAAGT TGATTATAGA GAAAATGGGA

7401 TTGAAGAATA TCGATATCAT TCCGAGTGTC ATGGTTCGTG AAGATTATCC
7451 TAGCCGTCCT GGGGAGGGCT ATCGCGAAGG CCTATTACGT ATGTATGGTG
7501 GCAAGGGGGC TCTGTGACTT CCCTACTTTA GTTCCTAATG AGCGCTTGCC
7551 CATAGGGCCT TTCTTTGTCC CGCAGCACAC TTCCGGTGCG AAGGGTAAGG
7601 AGTTTGCTAA AAGGAATTTT TCTATAATTT CGGGATTGGA TGACATATTA
7651 AAATTATGTA TTCTTCAAAG GCGTCCTTTT GCTTTGCAGT GGGATAACCT
7701 CTCTGTGAAA AGTGATTATG AGGAGGCTGG GCCCGCTATT GGGATACGTT
7751 CTCTTGAGCC ACAAGTTTCT CAAATTTCTC CAGCCCACGG CCGGCTATGT
7801 AGTACTTTGG TCCAGTGGGC CCCTATCCTT GGTTCTGAGG AGCAGCTAGT
7851 TTGGTTAGAA GAAACAATGA AGCGCCTAAA GTTTCCTAAA AGTTTAGGTA
7901 GTAAGGACGC TGTTATTGTG GATTCGGAAA TGGTTCCTGT GAACGCCAAT
7951 CCTACTCAAG AGATACCTGC AGCTTCCGAG ACTGTAGAGT CTTCACCTGT 
8001 AGCTCCAGGG AATACAACAG ATACCATGCC TGCAGCTTCG GGAACTACAG 
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8051 ACACCACATC TGGGGTTTCA GAGGCTGCGG CGGCTGAGGC TGCCGTGGAT 

8101 TCTACACCAG GGACAGAGGA GGAGCCGAGT TTTTCTCTGA GGTATGCGCT 

8151 TGTAGTTCAA AATGTTCCCT ATCCAGAGCC GCCTAAAGAA CCTGAGGTGA 

8201 TGTTTACAGA TGAAGAAAAA AGTCTGATTT TAGAAGCTAC TCGTGCGCGT 

8251 CGTATGGAGT TGGACTTGTA TAATGGCTAT TTAGCTGATT ATGAACTTTC 

8301 TAAGGATGAA ATACAGAAAC ACGTTCCTGA TTTACCTGAG AATTGGCGTA

8351 CGAATTGGCG TTGGTCGGAG AGGCTCTATA AATTTTTCTT TAAAACAAAG

8401 AAAGAAGGAT TAGAAGAAAT TTTCTTAAAC AAAGAGTTAG GGAATATGAT
8451 TCTTGCCCGA GGGCTGGCGG CAACTCAGTC ACAAGCACGT ATTAAAGTAT
8501 TCAATTCTTT AGTGGCATGG CTCTTGCAAA GCTTTAACGT AGGGAGGAGC 
8551 TGTACAGCTA AACCTCTTCC TACGTCAAAA CTAGACCTCT TTAAATCGGA 
8601 ATTCGAGTCT AAGCCTAAAA ATAACATCTT AACGGAATTT TTGGTGGCCT 
8651 CTGATGAGGA GATTCTCTTT AAGGGGCTAC GGGTCCTAGA GCCTGGAATC 
8701 GAAGGTTGGT ATGACCATCC TGATCAAGCT GGAGAGATTC GGTCGGTACT
8751 CGAGGGTCTG GTGCAGGCTG GACGTATTTC TGGATATTGG GAGAATCAGC
8801 CGTTTGGGAG ATTTGTCCTT AGAGGAGTTG GTGAAAGACG TACCGAGCTT 
8851 GTAGAGCTTT TGGAGAGTTT AGTTGCTTCT GGTGAGATTA TGCAGTTCTT 
8901 TGAGTCTTCG GATGAAGAGG GTGCTTTTAT TATCGATAAC GAACCTAGCA 

8951 AGACTGCTAT GCTAAAACAG CGATTTAAGA GTTGTGTCAG GACGAAGCTT
9001 GTCGGGAGTT TTGCTGATGA GAGTCTTCCC AGAGGTAGGT TTACCATTTT 

9051 AGTTTAGCGT GGGGTAGAGC ACTCCACGAA TCTTAGGGAG CTCCTTGCGA
9101 CCAAGCTTGG AGATCCTCCA TGTTTTATTG TTTCTCTAGT AGCCAAATCG 
9151 TAGCCGCTCC TAGGAACAAT ^TTTTCTTTT TCGCAATATA AAATCCTGAT 
9201 TTAGAGAATA GGTCTTCAAG ATCGTGGTCC TTTGGAAGTT GCTGGATACT 

9251 TTTGCTGAGA TAGCTATAGG CGTCGGGATC TTTAGAAACA GACTTTCCAA

9301 TCCAGGGGAC GACAGCACGC AAATAGAGCT TATGGGCACT ATAGGTAGGG
9351 TGTGTTTTTT TTGGAGGTGT GAGCTCTAGA ATGCCCAGTT TTCCAGAAGG
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9401 CATAAGCACT CGGGAGATTT CTTGTAGGGC TTTATGTGGA TCCGAGAGGT
9451 TCCTGAGGCC ATAGGCCATC GCTGCTAGGG GATAAGAATG ATTCTCCAAG
9501 GGCAGTTGAT TAATATCGCT ATGAATAAAA GAGCAAGAGC CCTGGGGAAG
9551 GTGTTGTTTT GCAATGTCGA GCATTGCTGA GGAAAAGTCG ACGAGAGTTA
9601 CTGATGCTTG AGGGTGTGCG GCAATATAAC GCTTCGCGAC TTTTCCTGTT
9651 CCTGCGCAGA GATCCAGGAG AGAGTATCCC GACCCTAGGA TCTGGATCAA
9701 AGAGCGATTC CAGAAATGGT GCATTCCTAA AGAGAGTATT GTATTTGTGC
9751 GATCATACTT ACTCGCTATG GAATCGAAGA TCTTTTTACA GTCGGGCTTG
9801 TTGGTAGAGG GTTCCATAAT ATTCCCGGAA TTTTTCAAAG CTTTCGTAGT
9851 GTTCTTCTCC TAGACGGTAC TGGCATAGGG CATAGTATTC TTGAAGAAGA
9901 GAAGGGGGCA GACCTGTATG TTGATGAGCT TCTTTAAGGA CTTCTTCGGG
9951 TGAAGATTCG AACTGTTGGA GGGCTTCTTC CATCGCAAGG TTGGGTAGGG
10001 GATGTTCTTT CCAAGAGGTG CTGTGTAGAA GAAGAGCAAA TACAAAAGGT
10051 AGCTTTGTAA GATCATACCA CCCCGAGGCA AGGTCATAGG TTACAAATCC
10101 AGGAAGTACA GGATGTTGTA GCGCTGCATC TCCGATTAGG AGGAGGCCAT
10151 CATAATTTTC AGGGGTTTGT CTGAGTACTT TTGTAGTTAT GAATCTTAGG
10201 ATATGAGGAG TTGGGATGCG CCAGAGATGA CGACAAAGCA CTTTTAAGAG
10251 TCCTATAGAG GAGCGACTTT CTAAAGTTGC GGCAATCCGA GGTTGCGGTG
10301 AGTTAAAGAA AGTGGGAGCT GCATAGAGGT TTACACTGAG GATACGTTGG
10351 TTTGCTGCAA TTCCAAAGCC GGGGACATAC CCCAAGTTAT GAGAGATAGC
10401 TCCTAGGGAT GAGGTCAAAG CAACATCGAG TTTCCCTTCG ATTAGCAAGT
10451 TGAGGAGGTC TGCAGGGGGA GCAAGAACAC AGCGAATATC GTTTCTTTTT
10501 ATGAGTTGTA GGGACAGCGG AAAGGAATTA ATATAACTTA CGCAGCCTAA
10551 GCTTATACAT GGCTGGAGTT GGTTAGACAT GGCGTTCTCC CTTGTTGTGT
10601 GATGAGGGCC GCCATTCCCT CAGCGTCCAT TTTAATAGGT TCTTTAGATG
10651 AGGCCATCTG GAAAACCTTT TCCCCCATAT GTGTTGAAGA AAGGTCATTA
10701 GCACCACAGG AAAGGAGGTC TAGAGCTGCC TCAATACCTA GGTAATTCCA



10751 TAAGGCTTTC ATATTGGAAA AGTTGTCTAA GAAGATTCGG GCTACTGCCA 

10801 TTAAAGATTT TAGAGGGATG GCATGACCCT GGCCTGATTT TCTTAATCTT 

10851 TTTCCTAGGA CATTATTTTC TTGGGCGAAT TTTAGAAGTA TGAAGTTTTT 

10901 AAAGCCCTGA GTTTCGTCTT GTAAGTCGCG GACTTTTACC ATGTGGGTGA 

10951 CGAGGTCTTC AGGTCCTTCT TTATGATAGC AGAGCATGGT TATATTGCTA 

11001 TGGATTCCCA GTTGATGAGC CATCTTATGG ATGTTGAGAA AATCAGAAGA 

11051 AGAAAGGCGT TTGGGAGCTA AGAAATTACG TATTTTGTCG ACGAGGATTT 

11101 CAGCTCCTCC TCCGGGGATG GAATCAAGAC CCGCATCTTT TAATGTGAGA 

11151 AGAACATCGC GAATAGAAAG GTTATCAAGA TCTGAGAGAT AGGCATATTC 

11201 AATGGCAGTA AGAGCTTTGA TATGGATCTG AGGATCGTAC TCTTTGATTT 

11251 TAGTAAATAG ATCGGAATAG TATTGCAGAT TGCAGGAGGG GA^aACAGCCT 

11301 CCCACGATAT GTACTTCTGT AATTGGAGTT TTTATATTTT GGATTTGCTG 

11351 TAGAAGATCA TCTGGGGAGT AGAGCCATCC TTTAGGGTCT CCAGGTTTTG

11401 CATAGAAAGA GCAAAATTTG CAGCTGAAGT CACAGAAATT TGTAGGATAG
11451 AGGTACAAGG TTGAGGAGTA GTATACAGTG TCGCCAACCC GTTGTTTGCG 
11501 AACTTGGTCT GCAAAATTCC AGAGTGTGCG TTGATCTTCT TTATTCGTGA 
11551 GGAGGAGGAG ATGAAGAGCG TCTTCACTGC TTAATCGTTC TTGGGCATCC 
11601 AGTTTTTCGA ATATGGAGTA GAGGGGGGAA GTTTTAGGGG GCTGTGGGAG 
11651 GCACGTCGTC ATTTGATGAA CACTTTGATG TACTATTCTC TCGAGATTTT 
11701 GTAGCACAGT GCTCTGTTTT GTCACATGTT TTTTTTTGGC AGCAATCTGG
11751 TTTTTGACAC CCTTTAGAGA GGGGCCTGCC AAGCAAATGG GAACCTACAA
11801 GTAGAATACC CATCCCTAGA CCTAACAATA CTGTGGCACA GCAGATTACG 
11851 AGAAAAAGTG TCATCATAAG AAATCCTTAG ATAGGATAGT TTCTTAATTT
11901 AAATCCACCC AGATTGGGGA ACTCCAGGCC ATAGCATTGT CTGCCTGAGT 
11951 GACCCTGAGA TAGTAGAATA CAAAAGGTGC TTTACCGTTT GGATCTTTTA 
12001 GGGTCACTGA ACTTAGGGGT ACCATATCAT CGTATTCATA GTCCAGGTTA
12051 TTGCTATCGG GGAAGAAGGT ATGGAGAACT TCGCCATTGC GGATGATTTC
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12101 TACAGTCTTG AGTAGGGCAG TGCCTGCCAC ATGACCAGAG ATGTGACGGT 

12151 TGACGTTGAG TCCAGGTTTC GACCCTGTGG AGAGTTCGGA GCCCATAGGG 

12201 GCTGAAGTGA TGTTGAAGCT TAAGACGATC CTAGGTCCTG TTGTAGCGTA 

12251 GCAATGACGT GCGAATAAAG CTTCAACAAG AGACTCTCGG GTATATTTAT 

12301 TACAAATGAT AGCCGTCAAC CCTGGGGAAT ATTGCACTTG CGGAGAGTCA

12351 AAGTAGTCTT TATAAATTCC TCGATCGTCG AGACCCCCAG CAACAAATCC
12401 GAAGCGGAGA TTCTTCTTTA ATCCTTCAAT TACTGTACCT CGAGGATCTT 

12451 CGCTATCTTT ACCTTGGATA GGGAAGGGGT TGTTTAGAGC GGCTGTGGTT
12501 TCTGAAGATC CCCAGGCATT ATAAATTTCT ACAACTCTTT CGAACTCGGG
12551 GTAGAAATTC TCAAAGTCAA AACCATGTTC TTTAGAAGCT GTGAACGAAG
12601 GAATAGAAAT CATGTCGTGG TTGACAGTGC TTTTATAGAG CTTGGCGAGG

12651 GGAATATGTT TGTATTCTTT GTGTTTCGAG TGGGACTTTG TTTCCTTGGT
12701 ATGAAGGATG TGACGCACTC CCTCGAGATG AGGTTCTCCG CTATATTGGA
12751 ATCCGGATAG TGTGATGAAG CGATCTTCTT CATTAAAGTC GGAGACAGTT 
12801 TGATTGATGA GCTTCCAAAT ATCTGGAGAG AGGTTCTCTT GATTTTCGAA 
12851 TGATGAAGAA GCATAGAAAT TCAGAGCGCG GTCATCTCGG AAATAACGCA 
12901 TACAAGTTTC AATATTTTCT TCAGAGTCGA CGCGTTCGGA TTCGCCGTGG 
12951 AGGAGACCCC ACATAAGATT CGGGGCGGAG TCAGCGAAAC ATTTGATAGG 

13001 GGCAGAGATG AAAATTTCTT GTGTAGAGAG GTTTTTCAAT TGGATGCGAT
13051 AAATTCCAGG CTCATTGAAA TAGAGATTAG GAAGAATAAC AAAGCCTGTT 
13101 TCTGGGATGA AGAGCTGCCA ATTTAAATTT TCTCTAAGAT GCTCGTAGGA 
13151 AAGCTCGATT CGGGTCTCTT CAGGAGAGAA GTTGGTGAGG TTCCCGAATT 
13201 CGTCTTCAAA TCGCACGGTG ATATCGAAGC GTTTGTTTTT AACGACATAG 
13251 GAGGGAGTAA AGATCTCTAT TTTTTTTAGG ACGTTTCCGC GGATATCCAT 
13301 AGAGAAGACA TCGGGTTCAT CATAGTTTCC TTCTCCTGTA GGATCGATGT 
13351 AGAGGTAAAA GGGTTTGCGA CGTTGTGCGA AAAGTTGGGC TCCGTTCCCA 
13401 GCATCATCGA CTTGAGGATG GTTTGGAGAG GCTCCCATGA CAATAGTGAG
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13451 GGTTTCTCCT ACTTGAAGTT CGTAGGGGAG AGTAAACTCG AATTGTGGAA
13501 CGGGATTGTC TTTTACAGGA ATGGCGGTTG CTTCGATGAT TTCGCCTTCT
13551 GGCATTTCTG CGTAGATTAC GTTTCTAGTT TGGGAGAGAT CTGTCGCGGG
13601 GGCTTCCCAA TCTGTGGGTT TCCCACTTCC TGCTAAGTCA AATTTACATT
13651 TGGTTCCAGC TGGTAGTGGT GTGGCAAGGG AATAAAGAAA TTTCCAAGTA
13701 GAAATTTGCC CTGCTCGAGC TATCGAAGGG TTAACGTAAC AAACAGATCG
13751 TCGCATAGTA AGAGGGAGGC TTTATATGAC TTAAAAGCGC CATCATATAC
13801 TAACAGTGAG GTTTTTCTCA ATCCCCGTCT TTGTTTAGTG TTTGTATCGC
13851 TTCATCCACA GTATTGAATA TTTTAAAGTA AGAAAGGAAT CCTGTAACAT
13901 AGAGAGTTTG TTCTATGGTT TTTGGGACTG TAGTCAGGAC AATTTTCCCA
13951 GAATGTTGTC CTACTTGATG GTAGCTTTGC AGTAGGACTC GGATACCTGC
14001 ACTGGACATG TAATCGAGGT GAGCACAGTC GAGAATGATA TTTTTGGATC
14051 CAGCTGCTAG GGATTGGGAA ATATTTTCTT GTACTTCTGG AGAAGAAATT
14101 CCATCAAGTT TTCCGTGGAG ATGAAAGATT GTTGTTGAGC CGTGTTCTTC
14151 TTTTTGGATA TCACTCATCT AGATAGTTCT CCTAACTATA CGGGAGCTTA
14201 AGTTTTCACT CTGATAAATC TTTAGCTTTT TTGCAAAGAG ATTTTTATTG
14251 GTGATGTTTG AGAATTTCGA TTGGGGGGAG GCAGGATGGG ATCGTGGAGT
14301 GCAAGGAAAT CGGTCCTAGG ATGTTTACAT TCTTAGGAGA TATTGAATTT
14351 ACGTTTTCTT GGTGTGATTT TTAGTTTTCC GACATTTCGT TCTGTGCAGG
14401 TAATGATTTC TATATCGAAG TTTTCGTGAT GGATACGCAT TCCTTTTTGG
14451 GGAACAGCAC CCACTTTATG GAAGACATGT CCTCCTAGTG TATCGTAGCT
14501 ATTTTCATGA TCGATTTTCA AATTGAAGTA CTCTTCAGCG TCGGAGATAT
14551 TCATTCTTCC ATCTACAATC CAAGAGCTTC CGATTTTCTT ATAAGGAGTA
14601 TTTTCTTGTA CGTCGTGCTC GTCTGCGATC TCTCCTATAA TTTCTTCGAT
14651 AATATCTTCC ATGGTAGCGA TGCCTTCTGT GAATCCGTAT TCATTGACTA
14701 TGATGGCTAG ATGGCGATGT TTTTGTCGGA ACTCTTGGAG AAGAGAGGAG
14751 GCTTTTTTTA TTTCTGGGGC ATAGAATGGG GGTTTGCTAC TGAGGATATG
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14801 GGTTGGCTGA GGTCGTGGCT GCTTGTATAG AGCAGTAAGA GATCTTTAAC
14851 AAGAAGGATT CCTGTGATGT TGTCTAAGTT TTTTTTATAA ACGGGAACGC
14901 GACTGTAGCC TTCTTCGCTT ACGAGAACCA GAGCTTCTTG TAGTGTAGTT
14951 TCTTCGGGAA GTGCGAAAAT ATCTACTTTT GGGATCATGA CTTCACGGAC
15001 AATGAGGTTA TCAAAAGCGG AGAGGGCTTC GGAGAGCTGG CTTTGAAATG
15051 ATGTTGAAGA TCGTACTTGT TGGTTAGGGC GGCGTCTGTA AAAGAGCAGT
15101 TGCAGTGGGA AGAGACCGAG TTGGAATACC GAAGCTAGAA AACGGAGGTG
15151 GGCGGTGGTT TCTTTAGGGA CTTTTGTAGA GATCCATGGG GGGAGGAATC
15201 CGTAAGCTAT CAGGGCGCTT AGAGAGTATA GGGGCCAGAA TAGGAGATCT
15251 TTGTGAGCTG TTTTTGGAGG GAGGAGGGTA TAGAGTTTTG TCCCGAGAGC
15301 TCCATAGAGG ATGCAGAGCA GCGTGGCGAG AATTGTAGGA GCACTGGGGA
15351 AGGGGGGATA CTCTCTTCCT TTATCTTTGA AGAAGCGTTG GTTTAGGGTT
15401 TTTAGGAATT TTGAGGATCC GTGACAGGAC GGTTGCGTAA GCCCGAAGGC
15451 TAGGAATAGA AGAATACAGA ATATGGCTAA AAGAATATGG AGCATGTTAA
15501 GCTGTTAGCA AAGCATGTTT TTTTCTTAAC ATACACAGGA TTTGATTTTC
15551 TTTAACTCTC ATTTTTCTCT TTTCTTCTGA TGAGGTGTCG TCGTATCCGA
15601 GCATATGGAG AATAGAGTGG ACGAGGTATC TCGAGATTTC TTCGTAGATA
15651 TCCTCTTGGT TTGGGGATGT GTTCTCTAAA AACCTAAGAG CGGCCTGTGG
15701 GCTAATGAAT GCTTCTCCTA AAACATGAGG ATAAGCGGGA TCTCCGGGAG
15751 CATCAATAGG CAGAGTGATC GTATCTGTTA GAGAAGGATC AGCAAATACC
15801 TTATCATGGA GTTCTGCAAG AGCTTTATCT TCTAGGAAGT AGATAAAAAT
15851 TTCATTAGTT GTTACTTTTA AGTGCTCTAA GAGCGTAAGA ACCAGCTTCT
15901 CTACAGAAAC CAAATGAATA GGAATACATG TTTGCTCATT GGAAACATGT
15951 ATTTTGATCT TTTCTTGCGT CACGCGAATG AAATTCCCAT AGAGAAAAAC
16001 AAACTTATTT TAAAATAGGG GTCTTAGGTA AACCTGTGAC TTTTTTCGCT
16051 GTACTATCAT TCCAACGGCC CAACTTACGC AAGACTTCTA CTCGCTCAAA
16101 ACGCTTCAAA ACATTTCTTT TGGTAACCCC TTTGACAGAT TTACCATAAC
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16151 TACGATGTCG AGACATAATC CTGCTCTAGA AATAAACCTA TTTTCGGGAT
16201 AAAAATGCTA TCACTGGTGC TCAATGCATT CGTATGCAAT TTATATACAA
16251 TTCTTGGAGC TGGCGCTGGA TGCACAATGG CATACTTAGG TTTACGTTTT
16301 TTAGGACTTT TCGCTCTGCG CCGAGCTTGT TTACTCATTC GTGTCATAAA
16351 TAACTCGATA TTGAATTTTT TAATTTCTCC AGACAACGGA AATTGAGGAT
16401 ACGGAGTACT TTATAAGAAA AAGGATAGTA AAAGAAGAGT TTTTTTTCAA
16451 GAAGATGACG TCTTTTAGCT GCCTTGATCT TGGTGTAGCT CGTCAGGGAG
16501 GGGAGACGAG GGGCATGCCA TGGGGGTTGA GAGGACTCCA CAGGAGATAA
16551 TATATTTGAA AGCATCTTCG ATTTTCATAT CTAGGAATAC GATATCAGAT
16601 TTTCTAAATA GGGTAAGAAA CCCTGAGGTG GGGTTGGGTG TTGTTGGGAT
16651 GAAGACCGTG ACGAGGGGGT CGTCTTCCTT TTCTCCTGTG CAGCATACTG
16701 TGGGTGCGTC TCCAGCGACG AGACCGATGC ATTGAACATT TGCGTTAGGG
16751 AAAGGAACCA TAACTACTTG TTTGAAGGAT CCTGATTTTG ATCCAAATAT
16801 GGTAGTCATG ACTTGTTGCG CAGCTTTATA CACTGTTTTA ATGATGGGAA
16851 TTCGGTGTAA GATTTTGTCG TAGATAGAGA GTAGGGATTT AAAAATCATA
16901 ATTCTCGTGA GGAAACCTAG GAGCACTGTG GCGAAAAAGA GACCGAAGAG
16951 TAAAATGATT TGCAATACGA ATTTTAGAAG AGCTCTATGT TTAGTATAAA
17001 AGCTAAATTT CTCAAAGAAT TCCGAAGCCA AGCCTACGAA GGGTTGGGTT
17051 AGGAAGTTCA TGATCATAGT AACAATAGCA ATAGTAATTG CTAGAGGAAG
17101 GAGAATAACA AGTCCTGTAA TAAAGTATTT TTTCATGATT CTCCTGCAAG
17151 ATATGAGGAA ATGGGCATTT GTTTCTTTAC TATACAGCTT AAGATTATTT
17201 AAGATAAAAC TTTTCCCGAA TCTTCTGGGG ATAGGAGAAA TCTCCATGGG
17251 ACATCACGAT ACTCTTGAGC ATAATCGATG CCGATCCGGG CAGTTGCTGT
17301 TAGAGTCCCA GAGATTTTTT CTTTGCTGAT ATAGAGAGCT GGGGTATTTA
17351 GGCGTTGCCT ATTGTTTTCC AAAGAGATTC CTAGAGCTTG GCACACTTTT
17401 CCGGGTCCAT TGGTGAGAAG GTGTGGGGGT TTATCTCTCC ATTGGCGGCG
17451 TTGGATCATA AGTTCTTTGC CTTGATCAGG AAGGATGGCC CGGATCAGGA
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17501 CGGCATGGGG AATGTCCTCA GGTCCAGTGA CAACATTCAA TAGGTGATGC
17551 ATGCCATAGC AACGGTAGAG GTAAGCAGAG CCTCCTTTCA GGTACATCGC
17601 TCTGTTCCTC TGAGTTTTTC TGTAGTTGTA GGCGTGGCAT GCTTTGTCAT
17651 CAGGGCCACG ATACGCTTCG GTTTCTACAA TGTAACCTGA AGTTATCAGA
17701 CCCTCATGTG TTGTGATGAG TTTATGTCCT AAAAGCTGTT GCGCTAGTGT
17751 AATTACATCT TCCGATAGAA AAAAATGTTC TTGTAGCACG TTACGAGGCT
17801 CTTTTTTTCG TTCCTTTTTT CTTAGAAGGC GTTTTCTTTA TTTTCTTAGG
17851 TTTATCTTCT GTGGTTGTCG CTATAGACCA GACGATTTTT TGCGTAAGGA
17901 GATTCACGGA ATCAATAGTG ACTTTTATAG AAGCTCCAGG TTTCATTTTA
17951 TCTGGGATAG ATTCTGGAAG AGCGTTTTTC TTTAGGGAAT ATTCTTTAGG
18001 GAGTTCTGCT GCTGCAATGA ACCCTTCATG GCAGAATTCG GTCACTACAA
18051 ATGAGAGTCC TTCATGATTT GCAGTGATGA TATACGCATG GTATGTAGTT
18101 TTAGGTTGCT CTTGCAAAAA TTTATTTATG AACCGAGTTT TTTTGAGGTT
18151 TTCGAAAGAA TTTTCTGCTT TTGCGGATAC TCGTTCTTTT GTAGAGCATG
18201 CTCTTACGAT AATTTCGAGG TGCGTTTGGT CTATAGATAG GGGGTTGAAG
18251 AGAAGCCTGT GAACAATAAG ATCGATATAT CTACGTATGG GACTCGTAAA
18301 GTGGGTGTAG TAGTCGAGCT TAAGTCCGTA ATGACCTTTA TTTTCTGTAG
18351 AGTAGGAGGC TGTTTTCATA CTTCGGACAA ACTGCGAGTG TAGAACTTGC
18401 TCTAGGGGAT GTCCTGCTGA CGTAGTTTGC AAAAGGTATT GGTAATCAGG
18451 TTCTTGTGTG GGAGTGAACG TGATATCAAA GCCCATGTTT TTTGCCAATT
18501 CTTGGAAGGC GAGTAGGTTT TCATCATTGG GAGGTTCGTG ACTACGAAAA
18551 GGTAGAGAAA CGCCTTGATG GGAGATATGA TAGGCGACCA CTTCGTTTGC
18601 TTTAAGCATA AACTCTTCGA TGAGTTTATG GGAGAAGGTC TGGTGGTTTT
18651 CTATCAGAGC TACGGGTTCT TGAAGATTAT CCAAGGACAT AGTGACTGAG
18701 GGGAGGACAA AGCGAATGCA ACCACGTTCT TCACGGATAT CGGAAAACTT
18751 TTTACTTAGA GTGGCCATCT CATTGAGGAT TTTTGAGAGG GGGTGGGAGT
18801 GTTTCTTTTC AATGATGTTA TCGACTTCAT CGTAGGTCAT ACGATATTTG
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18851 CTTCGAATGA CGCTACGGAA AATCTGGTAA TCTGAAAGAT GACCTGATTT
18901 TGTAAACGTC ATAAATACGG ATACAGCGAG TCTATCAACG TTTGGTTTTA
18951 AGCTGCAGAG ATTATCAGAG AGTGCTGATG GCAACATGGG AATGACTTTC
19001 CCTGGGAAAT ATGTAGAGTT ACAGCGTTTA GCAGCTTCTT TGTCTAGGTG
19051 AGAATGTGGG GTAACGTAGT GGGAGACGTC TGCGATGTGT ACACCAAGAA
19101 TGTAATTGTT ATTATGATCG TAGGTGAGGG AGATGGCATC GTCGAAGTCT
19151 CTGGCTGTGG AAGAGTCTAT GGTGAAACAG AGGAGATCAC GGAGATCTTT
19201 GCGAGAGTGG AGAACTTGGG TAATGTGTTT TTGAGAGAAA AGGCTTGCTT
19251 CTTCAATGAC CTCTGGGGGG AATTCTTCGG CAAGGTTATA TTCGGCTTGA
19301 ATTGCCTGAA AGTCCGCTTT AGCGTTGGTG ATGTGGCCAA TAAATTCGAG
19351 CATTTGTAAG GCTGGAGAGG CTCCTTCTTG GGGTTTATCT ACCCAGGGAG
19401 GAGTGCTCAG AAGAATGCGA TCGCCGATTT TGTAAGTGCG TCCGGGAAGG
19451 AGTTCTACTG GAATTAAAGA TTGGGATCCC GACATGCTTG TGTAGGCAAG
19501 TGCTGATGTG GGACTGACTA GTGAGGTGAT CGTTCCTACG AGTGTTGTTT
19551 TTCCTCTTGC GAGTACTTCG CTGATAGTGC CTTTGAGTTT TTGTCCGTCT
19601 CTTGGATAGG GAAGCACGGA GACAATCACG TGGTCACCAT CTAGAGCCCC
19651 GCGTAAATCT CGGGCGGGAA CAAAAATATC AAATGGGTAT TCTTCGGGGT
19701 TGTCGGGAGA AACAAAACCG AAACCTTTTC TAGCATGAAC AAATAGGGTT
19751 CCTGGAATAA AAATCTTCAA GGATTTACCG TATGTTCTTC TCCCTGGTTT
19801 TCTTTTTGGT TTTTTCAACA ATTGGGCTCC GCCTGTAAGT TTAGGACATT
19851 GAACGAGAAA TCCCGTAGTT TCACTCGTGA ACTGAGTGGC TCAACAAAAT
19901 TTTCCCTTTT AAGTGGGTTT TGTGATTATG AAAGGAGCAG TCTCAAATTC
19951 AAAGCCAATT GTACTAAAGA AGCTCTGTTA TTGCAACTCC TTGATCAGAG
20001 AATAAGAGAA TGGAACTGTT TTCTGTTTAT AAGGAAGTTT CCCATCCTCT
20051 TATGAGGATG GGAATAGAGA AACTTAAATT GAAAATTTTG ATTACTTATC
20101 GTCGTTATCA ATAATTTCTA CATCAGCTTC TTCGATATGG TCTTCTGAAG
20151 AACCGTTATT TGAAGGAGGC TTCGTACTGA AACTATGTTT TTTCAAATCT
20201 TPTTAGGTCC ACCTTTAGCA TTGGCTGCCG ATGATGCTGC
20251 GACTGCGATT GCATAGACTC TCCAATTTTT TGCATATGCT
20301 J. \jV«» 1 1 rt\JO X ^ TTCAGTAACC TCTTTAATTT TTTCAATAGG AGCGTCATCT
20351 TTriAnTGCGT TGCGCACGTT TTCGATTCGC TCTTCGATTT CTTTAACTAA
20401 111 ^rtVjVJrt ATTTGCTCCT TATAATCTTT AATAGCTTTT TCGGCTCTGA
20451 Ar* ATP ATGPT ATPGGCTTCA TTTTTAGCAT CTGAAGCTTC ACGACGTTTT
20501 TTATPTTPTT PPTTATTAAT TTCGGCATCT CGAACCATTC TTTGGATTTC
20551 ATPTTPTTGA AGTCCTGAGC TTGCTTCGAT ACGAATTTTC TGTTCTTTAC
20601 PPPTPPPAAP ATCTTTAGCT GAGACATGGA AAATTCCGTT TGCATCGATA
20651 TPPAAGGAGA TTTCGATTTG AGGATGGCCT CGAGGAGCCG GAGGGATATC
20701 •PPT a AP ATPP AATPTTPPGA TTTCCTTGTT ATCTTTGGCC ATGGGACGCT
20751 P T PPHTTPP A P AAPTAPGATG GTAACCGCAG CTGGTTATCA GCAGCTGTGG
20801 2i P 2i ^ P A T"T"T'P TTTTTTPTGT GTAGGGATTG TAGTATTTCT CTCTACCAGA
20851 PTPPTPATPA PGrPTCCTAG AGTTTCGATA CCCAGAGATA GGGGGATAAC
20901 PTPTAPAAPT AGAACATCCT TAACTTCTCC GCCAAGAACA CCACCTTGAA
20951 TTPPPPPTPP AATAGCAACA ACTTCGTCGG GGTTGACTCC TTTATTAGGC
21001 TPTTTPPPPA AGAGTTPTTT TACAGTTTCT TGCACTGCGG GCATTCTTGA
21051 PATAPPTPPA AGTAAGAGAA CATCATCGAT ATCCTTAGCG GAAAGTTTTG
21101 PPTPAPTPAP TGCTTTGATG CATGGAGATT TTGTTCTTTC GATTAGAGAG
21151 PPTPPPAPTT TCTCGAATTG CGCACGTGTG AGTGTCAATG CAAGGTGTTT
21201 APPTPPTTGT GCATCCATTG TGATGAATGG CTGATTGATT TCTGTGGAAG
21251 APAPTPPTGA AAGTTCTATT TTTGCTTTCT CAGCAGCATC TTTAAGTCTT
21301 TGTAAPGPCA TATTATCTTT GCTAAGATCA ATGCCTTCTT GTTTTTTGAA
21351 TTPTTPGATC ATCCATTTGA TAATGACTTC ATCAAAGTCG TCTCCACCGA
21401 GGAGAGTATC TCCATTTGTA GATAGAACTT CGAAGACGCC ATCACCGATT
21451 TCTAGGATGG AGATATCAAA AGTTCCTCCA CCAAGGTCGA AGACAGCGAT
21501 TTTTTTATCA CCGACTTTAT CGATTCCGTA GGCAAGAGCT GCTGCGGTAG
21551 GTTCTGGAAT GATACGTTTT ACATCTAGAC CTGCAATGCG TCCAGCATCT
21601 TTTGTGGATG CTCGTTGAGA ATCATTGAAG TATGCGGGGA CGGTGATCAC
21651 TGCTTCTGTG ACAGTTTCGC CTAGATAAGC ATCAGCTGTC TCTTTCATTT
21701 TCATTAAGAT TTGTGCGCCA ATTTCTTCTG GAGTGTATTG TTTGCCATCA
21751 ACTTCGAAAA CGGCATCACC TTTAGATCCG GAGGTGACTG TATAAGGAAC
21801 GGTTTGGATT TCCGAAGCTA CTTCAGAGTA CTTACGGCCA ATAAAGCGTT
21851 TTGTAGAGCC GAGAGTTTTT TCTGGATTTG TCACTGCTTG ACGTTTTGCT
21901 GGAATCCCCA CTAATTTCTC ATTACCTTTG AAGGCAACGA TCGATGGCGT
21951 GGTTCTTGTT CCTTCGGATG ATGTAATTAC TTTAGCTTGT CCTCCTTCCA
22001 TAACAGATAC GCAGGAGTTT GTTGTGCCTA AGTCTATACC TATAATTTTG
22051 CTTGATTTTT TGTGTTCACT CATGTTTGGT ACCTAATCTC TAGGGGTTAT
22101 TTCTATTCTT TATTTTCTTT GGGAGTAGGA GCTTTAGCGA CTTTAACTTT
22151 AGCTACCCGA ATCGGGCGTT CTCCTATTTT ATATCCCTTT GCAAACTCTT
22201 CTAAAATCGT CCCCTCAGGA ACTTCAGAAG TCTCTTCTGT TTGCACCGCT
22251 TCGTGTAGGA AGGGGTTAAA CTTTTGGCCT ATTGAAGAAT ATTCAATAAT
22301 ACCTTTTTCC TCGAAGATTT GTTTGAATTG GTTGAGAATC ATGTTGAATC
22351 CGAGGGCCCA ATTTTTTACA TCGTCGGACA TTTGTGTAGC AAATCCGAGG
22401 GCTTTCTCCA TGCTTTCTAT GGGATTGAGA AAGTCTATTA AAGTATTTTC
22451 TAAAGCATAC TGCATAAGTT CTTGGCGTTC TTTTTGTAAG CGTTTTCTAG
22501 AATTCTCAGA TTCTGCTAGA GCCATGAGAT ACTTATCGTT TTTTTCTTTT
22551 AATTCGGTTT TTAGGGTGAC GATTTCCTGT TGCAAATGTT CAACTTCATT
22601 TTCGTTTTGA ACATTGCTTT CGTGTTGTTC CTCATTTTCA GGTGGGGTAT
22651 CTGTCATAAC GTCTCCTTAG AGGGTAATAG TTTTATAGAA GAGTACTCCG
22701 TTCTTAAAAT AGGCTCATTC GAAAGCTTAC AGTTAGAGGT GAGTGGTCTT
22751 CTGAAGGATA GTTTAAATTT GTAGAAACTT TGTGTCAGGG TTTCATTTAT
22801 TTTATTCGCA AATAGTTTGA GCAAAGGAAG AGCTTCCTTA TAAGGAAGAT
22851 TGATCGGGCC TAGGATACCT AAAGCTCCGA GTGGAGAGCG ATTCATATAA
22901 TAGGGAATAG TAATTACAGA ACATCCTGGA TTCGAGGTCC CTAAAATATC
22951 AGAAAGCTCC TTCCCTATGA ACGCTGTAGC TCTTCCTTTA TGCATTCCTA
23001 TATTTAGAAG CTCACACATT TGTCTGCGAT TTTCAAAAAG AGAGAGTCCT
23051 AGAGCTAGAA CTTCAGGATC TTTAAACGCT TCGTATTTCA GTAGTTTCGA
23101 CATTCCTGTT TGATAGAGAT CTTCTTCACT AAAGTTGCAG TAGCGTGTTA
23151 GATAGCGGAC AACCACCTCA TTATAGAGGG ACATGCTCAG GTGTTCTTCT
23201 TTTTTCGAAA GTTCCTCATT TGTGGGGAGC TTTCGGATGT AGTTCTGCAG
23251 GAATTTTTCT ATACGTTTGA TAGAAAGAGT ATCGCAAGCT TCAGGCAGCC
23301 ATAGGGTGTC TGTGAAGATC TGACCAAACT CCGTAGAGAG GATGGTGACA
23351 GCTCTTTGCT TATCGACCTG TGTAATTTGA ATATTGGTTA CGGAATCATT
23401 TTCAAAGCGT GGGGAAGAAA AAA.ZVCGTAGG CAGGTCTAGG ATTTCTCCAA
23451 GAAGTTCCGT AGCTTTTTGT AGATCCTTGA TAATATTGCG ACTTTCGCTA
23501 GGAAGCTGAC TGATCTTATC AAAAATGGGG GCAGAAATCT CAGCTTCTGG
23551 GCATTCTTCT TGGTGATCTA CATAGTGACG TAATGCTAGG TCTGTAGGGA
23601 TTCTTCCTCC GGAAGTATGA TTTTTTTTTA AGAATCCTTC AGCTTCAAGT
23651 TCTGCAAAGT AATTTCTTAT AGTTGCCGTA CTCAAATCAG AGCAAAAACT
23701 TTCCTTTAAA GTTTTAGACC CTACAGGCTG CCCTGTTTTT AGGTACAACT
23751 CTGTTGTAGC AAACAGGATA TCAAGGATTT TTGAATCTCG CTTTGAGACT
23801 TTGGATCTAG CCATCTCGAG TCCTACTAGA ACAATCGTAA CTGAGAGCAT
23851 TATAGGAAAA GAACCCGAGA AGGTCAAGAG AATTTAGCAC TCGAATTGAA
23901 TGAATGCTAA CTTTTTTACG AGGAGGGCAG CGATCAAAGA GCTAGGCTAA
23951 GTGATTCTGA CACCAAGTAG GGAAGGCCTC CGGGGAGACT GTATACTTTT
24001 CTCCAGATCG GGATTCAATT TCGAATATTC CCGAAGATTG GTAGGACTTT
24051 CCTAAAATAA GCTTATAAGG AATGCCGATA AGGTCACTGT CTTTAAGTTT
24101 AAATCCGAGT CTTTCATCTC GATCATCAAG AAGGGGCTCA TAGCCTTGAC
24151 TTTGTAGCTC ATGATAAATA GTTTCCGCAA GCTCTTGAGA TACAGTGTCT
24201 CCTCCGTTAA AGGCGATAGT GATAGAGAAG GGAGCGAGTG CTTTTGGCCA
24251 AACAATACCA CGGTCGTCGG CAAGCTGTTC TACACAAGCG GCTAATGTTC
24301 TTCCGACTCC AATGCCGTAG GTCCCCATCC AGCACTGCTG GGTTTGCCCG
24351 TGTTCATCTT GGAAGTTTAC CTCAAAACTA TCGGTATAGC GTGTCCCGAG
24401 ATTGAAAATA TGAGCAACTT CTATGCCTTG ATAAATGCGG TAAGGATGGC
24451 CAGGATTTTC AGGACATGTG TCTCCCTCTT CAGCGAGTAG AAAGTCACCG
24501 TATTGGGGGG GGAGGAGGTC GCGATCCCAG TTTACATTTA CGTAGTGCTT
24551 ATCTTTAGCA TTGCCCGCAC AAACAAAGTT CGTCATTGGG GACGTTGTTT
24601 CGTCTGCGAA AAAGTCTATG GGACAGTTTA GGGGACCGAT GAATCCTTTT
24651 TCTGTGCCTA GAACGCGTTC GATTTCTTCA TGAGAAGGTA GAGCAATATC
24701 ATCGGCATTC AGTTTGGAAG CGACCTTCAC TAGGTTGACT TGCGGATCTC
24751 CTCTCATTCC AATGGCAATG AATTTTTCTT CATTTGAGTA GGAGAGTTTT
24801 ACGACAAGGG TTTTTAAAAT TTTATGTAAG GGGATAGAGA AGAAGTTTGC
24851 TAGAGCTTCT ATTGTTGTAA TCCCAGGGGT GGCCACTTCT TCGACGGGAA
24901 GAAACTCGCG ATCGTAGGCA TGCTGTGGAG GAATGGAGAC AGCAGCCTCA
24951 ATATTAGCTC CATAGGAACC GCTGACGCAG ATCGTGTCCT CGCCTAGAGA
25001 GCAAAGGACC TGAAATTCCT CAGACTTTCC TTTGCCGATT TTCCCTCCAT
25051 CAGCTGTAAC GATGACATAG GCAAGACCGA GACGATCAAA GATCTTACTA
25101 TACGCAGAGC GGAGTTTTTC ATATTGCTCG TTCATTTGTT CGGGAGAGTC
25151 TGAGAAGGTA TAGCTGTCTT CCATAAGGAG CTCTCGAGAG CGAATGAGAC
25201 CGAATCGAGG GCGAATCTCG TCTCGGAATT TTGTAGCAAT TTGGTAAAGG
25251 TGGAGAGGAA GTTGTCTTTT TGAGGAGAGC CATTGTGCAA CAAAAGAGCA
25301 GATGACCTCT TCATGTGTAG GAGCTAGGCA ATGAGATTTT CCTTCGCGGT
25351 CTTTGAGAGT GTAGAGCAGT CCTTCCGAAG TAAATGCCTC CCATCTCCCT
25401 GTATGTTGCC AAAGTTCAGC ATTGTGGAGA AGTGGGAGTA GAAGTTCTTG
25451 ACCTCCAATC GCATTAAGTT CCTCTCTAAT GATGTTCATC ATCTTGGAGA
25501 CCACGCGCCA TAACAGGGGT GTATAGGTAT AGACTCCTTT ACTTACTTTA
25551 AATAGGTATC CTGCCTTTTC TAGGAGCTCG TTTGAGAGCA CAGCAGCGCT
25601 TTTATTTGCA TTTTTTGAAG TCTTATAAAA GAGTTGAGAC GTTTTCATAG
25651 AGTGGGGCTG TTATCTTCAG TGATTGATCG TCAAGAGTCT ATAGAAAAAA
25701 TTACAGGCTA TTGCGATTTT ATATCGAGAG TCTTTTAAGC TAGTAGATTT
25751 AGTTTAGCTA TGAAGGAGGT ATCTTAACAC AGGGATTTTT CTAGATCAAT
25801 GTTTTGCGTT GGGTTCTTAA CTTAACAATT GGGGTCAGGA TTATGTATTA
25851 TTTTAATTAT ATTTTTGTTT TTATTTAGAA ATTTAATAAT CTCAACTTAT
25901 AATTAAGTGT ATACTTATTA ATCTTTTATT TTTGTAATTG TAGTACTATG
25951 TCGTCAGTAA ATCAAAGCTC TGGAACCCCG AATCCAGAAG AGGTAACTTC
26001 TCCTGAATCT ACGGAAGAAA ACAAAAATGT TGTTTCTTCA GATGAGGCGC
26051 AAGCCACGCA TGCTGTGGCT CTTCCTATAG TCACTCAACT TTCTCTTCCT
26101 GAAGGTGTGG GGACCTCATC TGAAGAAACG GCGAGTAATC CGAGGGTAGA
26151 CGAGATTGTA GCTGAAGTTT CTTCGAGTCG GGCGGTTGCT GATCAGATCT
26201 CATCACTTGT AGAGCGTGTT GGAGAGCTTT TAGACGACCT TAAGGGTGCC
26251 CAGTCCCTTT TCACTAGCTT TCAGTCAGAG TTGAAAAACT GTCTTCCGGC
26301 ATGGAAATCT TCAACGAGAA GACTCGAAAC TCGAGGTGCT GGGGATAATG
26351 CGGATATAGC GAGGCTGGAA TTATTTCGTA GCGATTACGA GGCTGTCTTA
26401 GGCCATGCGA ACCAGTTTCA TGGGAAGGCT CATCTCATTT TAAGTAAGTT
26451 AACAGATGTA CATCACAAGC TACAGGGACT CAGTCGTGAA GATCTTTCCC
26501 TGGCGTTTGA CAATAATGAT AGGGTTCTTG AGCATCTGGG TTCGTTAGGG
26551 CTTGATGTAG ACGCTGAAGG TAATTGGTCT CTTTCTTGTG AGAGGGGGAT
26601 TCCGCGACTG GTGCTTACTG CTGACAGTAT GCTTGTCCAG ATCAAGAAAG
26651 TGAATCTACC TACTGTAGAA GAATTGCGGA CTCTTCAGGG AACAACGGAA
26701 TCTTCGTCTG ATCCTAGGGT TGAAGAAAGT TTGTCTTGCT GTGAAAGATT
26751 GCTCAATGAA TTACGTCGTC TTTGGGCGAA TTTTGTAGGT TTTATTTCGA
26801 GTTGCTATGA CAACATCGTG TTTGTTTTGA TGTGGATAGT GAGACGGATT
26851 AACCTTTTGC CTGGGCTGGG GTGTTTGCCT TTCCATAATC CCGATGCTTC
26901 TCAAGAAGAC CAGAGGTCTT CTTCCGGAGA GCGTTCTACA AGGAGAGAAC
26951 GCCTTTCTCG GCGATCTGAC TTATCTGAGG AAGAGATGAT TGTGAGAGCT
27001 GAGGGAGAGT CTATACATCC TGAATCTCCC CATGGAGATG GCCGTAACCA
27051 ACCTAGTCGA GGTGATAAGC AAGACTCTGA TAGTGAGGAA GAGACGGAGT
27101 TATAATAAAG AATGATCTCT TCTGAGTAGA AACTATCGAA TCAGAATTGC
27151 CAGCTTTTAC TCTCACCCAG TCCTATTTTT ATAGTTCAAA TTGCGGTGGT
27201 TTGTTTATTG TTTTTTTGAA TTTTTGTTAT ATGGTGGCGT CTAATATTGC
27251 CGATTAAGGA GGGCGCCCTT TATGAATAGA AGAAAAGCAA GATGGGTAGT
27301 GGCATTGTTC GCAATGACGG CGCTCATTTC TGTTGGGTGT TGTCCTTGGT
27351 CACAAGCGAA ATCAAGATGT TCTATTGATA AGTATATTCC TGTAGTCAAT
27401 CGTTTACTAG AAGTTTGTGG ACTTCCTGAA GCTGAGAATG TTGAGGATTT
27451 AATCGAGTCC TCGTCTGCTT GGGTACTGAC TCCTGAAGAA CGTTTTTCTG
27501 GAGAGTTAGT CTCTATCTGT CAGGTTAAAG ATGAGCATGC TTTCTATAAC
27551 GATTTGTCTT TATTACATAT GACTCAGGCT GTGCCTTCGT ATTCTGCAAC
27601 GTATGATTGT GCTGTAGTTT TTGGCGGGCC TTTGCCAGCG CTACGTCAGC
27651 GCTTAGATTT TTTGGTGCGA GAGTGGCAGC GTGGCGTGCG CTTTAAGAAA
27701 ATCGTTTTTC TATGTGGAGA GCGAGGGCGC TATCAGTCTA TTGAAGAACA
27751 AGAGCATTTC TTTGATTCTC GGTACAATCC TTTCCCTACT GAAGAGAACT
27801 GGGAATCTGG TAACCGAGTT ACTCCCTCTT CTGAAGAAGA GATTGCCAAA
27851 TTTGTTTGGA TGCAAATGCT TTTACCTAGA GCATGGCGAG ATAGTACTTC
27901 AGGAGTCAGA GTGACATTTC TTCTAGCAAA GCCAGAGGAA AATCGTGTGG
27951 TTGCGAATCG TAAGGACACC TTACTTTTAT TCCGTTCTTA TCAAGAAGCG
28001 TTTCCGGGAC GCGTGTTATT TGTAAGTAGT CAACCCTTTA TCGGTTTAGA
28051 TGCTTGCAGG GTCGGGCAGT TTTTCAAAGG GGAAAGCTAT GATCTTGCTG
28101 GACCTGGATT TGCTCAAGGA GTCTTGAAGT ATCATTGGGC TCCAAGGATT
28151 TGTCTACATA CTTTAGCGGA ATGGTTAAAG GAAACGAACG GCTGCTTAAA
28201 TATTTCAGAG GGTTGTTTTG GATGATfCAT GGATCTTAGA GGTTAAAGTC
28251 ACTCCAAAAG CCAAAGAGAA CAAAATTGTA GGCTTTGATG GACAAGCTTT
28301 GAAGGTCCGT GTTACCGAAC CCCCAGAAAA GGGTAAGGCC AATGATGCTG
28351 TAATTTCTTT ATTAGCAAAA GCTTTATCCT TACCGAAGCG TGATGTCACT
28401 TTAATTGCAG GAGAAACTTC TCGAAAGAAA AAGTTTCTTC TTCCTAACAG
28451 AGTTCAAGAC ATTATTTTTT CTTTGCATAT AGACGTATAG CCTAACTCAC
28501 AGCTACGTTT TTCCCTTTCT CAGATTTTTC CAATTTTTTG CTTTTAAAAG
28551 ATAAGAACTG GTTGCGTTCT GTTTTATGAA GCTTGATTCC TAAGTACTTG
28601 ATGATATCTT CATTAAAGGT GGTTGTAGGT GACAGGCGTT GAGCAATGAT
28651 TTTACGCAGG CTGTCCACAT CGTGATTGTT ATAGAGTAGG TGGTGCACAA
28701 TTTTTGCGAT TTGTTTTCCT GATTTTTTGT AATCCACGCT ACAGGCAATG
28751 CAGGCTCCTT CGGAAATTAA GGAGGTATCG TCGGTAATGA TAGGGATTTT
28801 CTCTTTGAGG ATTTCCTGAA GGAATGCGGT GCCTTCTTTA TGAGAAAGTG
28851 GGGAGAGGGG AATGAAGATA GCTGAGGGGC GCTTGTCGAT AGCCTGGCGT
28901 ATCCGGGTTT TGAATGTACT GCTTGTAATA GAGATCTCAA TGACCTCAAT
28951 TCCTGAAGCA TGGAGTTTCT TAACAATTTC TTTTTGGAGA TCTGAGGGGA
29001 AAGGTTCGGA GGGTTTTAAA TACACGATAG ATTGTGCATT GGTAGCTACG
29051 GCTTGTATAG CAAAGCAGTA TTGATTGATG TCTAGAGTGT CATTCACTCC
29101 GTAGATATTC ATTGTGTTTT TAGGAAGGGT TAGGCTTTCG CGATCAGGAA
29151 CAGCGGCATA GATCACAGGT TTCTGTGTTT CAATGTGGCT CATGACCTTC
29201 GTAGCAATAG TTCCTAAGGT GACAATCGCC ACGACATTTT TATCGGTATG
29251 TAAGGAGCGA GCAATTTTCC TAGCCTTTAC GATACTGTCT TCAGCATTTA
29301 GGACAACAAT TTCAGGAAGG TTCTCAAAAT CTTTCAAQGT TTCTATACAG
29351 CTTTTACTGC AATCTTCTAA TAGGGGATGG GGAAAGGATA AGAAAATTGC
29401 GATTTTAGGA GAGGAGACGC TATCTGGTTG AGAACCACAA GTGGCTACAT
29451 AGATGAAAGA GCAAAACAGA GAAAAGAAAA ATAAGTACTG AGAGAGTTTA
29501 CGTAGAATTG TCATGCAAGA ACCATCGTTT CTTTTAGTGA AGCGGTACAG
29551 AGGCGGTCAC AGGCTAAAGC GATATTTTGT GGTTGTGTCA GAGCGGAGAA
29601 ACGAACAAAT CCTTGTCCAC AGGAACCAAA ACCGTGGCCG GGAGTCACTG
29701 CCTTCAGGGA GTTCTACCCA AAGGTAAGGG GCATGATCGC CACCATGAAC
29751 TGAGAATCCT GCAGTTTCTA AGCTTTTTTT AAGTTTCTGA GCATTGGTTA
29801 GATATAAAGA GATGGCGGGA GGTGTCGGAA ATAAATCTAG GCCGTAATAC
29851 CCTGCTTCTT GCATGAGGAG AGATGCTCCG TTAAATGTAG TCGCAAAGAG
29901 CCGTTTCCAA TCGTTGATCA TAGGTTCGTT ATTGTCATAG GTGAGTTCTT
29951 TAGGGATCAC GTTCCAGGCA AGGCGCATGC CAGTAAAGCC TAATGATTTA
30001 GAGAAAGAGT TGATTTCTAT AGCACAATAT TTTGCTTCAG GGATTTCGAA
30051 GATGCTTTTA GGTAGGCTAG GATCTGAGAC AAAGGCGCTA TAGGCCGCAT
30101 CAAAAATAAG AACGGTTCCG TGCTGATTCG CGTAGTTCAC AAGTGCTTGG
30151 AGTTGTTGAA AGGTTAGAAC TGTTCCTGTG GGGTTGTTAG GATAGCATAG
30201 ACAAAGAATG TCTAGGGATT GTTGGTTCGG AAGTTCTGGA ATAAACCCAG
30251 TTTCTTTTCT GCATGCTAGG GGGATAATGT CGCGGATTCC TGTAATGTGG
30301 GCAATGTCTC TATAAGCTGG ATAGACAGGA TCCTGTAGAC CTAGAGTCTT
30351 TTCTGAGCCA AAAAAAGAAA AGAGACGGAA GATATCAGGT TTGGCACCAT
30401 CCGAAATAAA AATCTCTTCA GGGGAGATTC TATTTTCATA GACTTCAGAG
30451 GCAATTTTTG TGCGTAATTT TTCTAATCCG GTTTCTGGGC CGTACCCACG
30501 ATAGGTCTCT TGTTTCTCTT GAGAAACGCA GAACTCTTTG ATTGCCTGAG
30551 TAATAGAGCG GCAGAGAGGT TGTGTCGTAT CTCCGATAGA AAGATCTATG
30601 ACAGAGATTT CTGGATTCTC CTTGCGAAAC TGAGCAAGCT TTTTACTAAT
30651 TTCAGAAAAT AGATACTGAG GCTTGAGAAG AGAAAAGTGG GGATTTCTAC
30701 GCATAGCGTG CTCCAGGTAA GGGCGTGATT CACTTGGAAG ACATCTTAAG
30751 AAAGCCCCAG CTTTTTGGAT AGCCATTTTC TGATTTTTCT TTAACCGCCT
30801 CTATAAAGGC AAGGAGTGCT CTGTAGTTTC GGTTAATTAT TAAGATATAT
30851 TTATAATTCT GTTTATTAAA AGTTTTTTAA ATCTTTTCTA ATTTGCTCAC
30901 TATAATTAAA GGATAAGATT TGAAAAAATT TTTTAGGTAA TTATGATAAG
30951 GGTCAATCCT TATGGAAGTT ATAGGGGTAG GAATCCTTCT CCAGAAGATG







31001 GGAAAAAGGA TGTACCCCTT TCAGGGAACT CTCGCTTGCA TCGTCGTGGT
31051 GGGATTCGTA GAAAGCATAA GAGTGCTTCA GTTGGGGTGA CCTCGGGTTC
31101 TAAGACGGGG AAAGCTTCTT TAGAGAAGAA GGTCAAAGGC ATTTCAGAAG
31151 CCCATTTCAA ATAATCCAAG ACAGAAGGTT CTCATTCGAA GACAAGCAAA
31201 GGATTCGTTG GCAGATTTGT TCAATGGATT AGAACGTTTA CAGGACGTGG
31251 AAGCAAGAAG CGTTCTCCCT CAAGTTTTTC TCCAACGCAC CCTTACATAC
31301 GTTTGCGAAC TTACACACGC AGTCCAAAAC AGAGTGGTGT AGAGAGAAAA
31351 CAAGAAGATG CTGAGACCTC ATTTATAGAG ACACCCAAAG GGATCTTGAA
31401 AAAGCCTGGA AACAAAGACC CCAAAGGCAA GCACGTCCAT TGGAAAGACA
31451 GCTAATCCGG ATCAGAGCTG AGATCCCTCC TTCTCTTCAT TAAGAGGGTT
31501 CAGAGTTTTA GGTCTCTGCG GTAGAAGGGA GGGAATCCAT AAGGAGAAGA
31551 CCAGTGGATT TTCAACGGAG ATTCTAGAAG GAAACAAGCT GATTCTACAT
31601 GGAGATGTTT GTAGGTCTGT TGTGAATTCC TCTCTTCACA GATCGGAAAG
31651 ATCAATGAAA AGAGTTAGGG TCTCTTAGTT TTAAGAATTG CTATACAATA
31701 TTGCGAAATA AATACTTTCT TCTCTATACA CTGTCTTTAC GATGAAGACA
31751 GCTTTTCACT CTTGCTATTC TTGGTTTTGT TGGCTCTTTA GCTTCTTGGT
31801 ACTCTTTGTG GGTGGCATCG CTGGGGGAGA GCCTTTGTGC CCCGATTGCA
31851 AATACGAAAC TAAGTCTGTT TTACGTTCGG ATCAGCTGCC GGATCATCTC
31901 TGGAACTATG AAAACGACTG TTATCTTACA GGTTATGTGC AGTCTCTTTT
31951 GGACATGCAT TTTTTAGATA GCCGTACGCA AGTTGTTATT GAGAAGAATA
32001 GAGCGTATCT TTTCTCTTTG CCTGTAGATT CGAGTTTATC AGAAGCCATT
32051 ACCAACTTTG TTAGGGATCT TCCCTTCATA TGTGCTGTGG AGATTTGCGA
32101 GCGTCCTTAT GGTGAATGCA TAACGAGATC TTCTGCGGAG CGTCCCTTAC
32151 TCCCTAAAGA GAAAACTTTA GGAATGCCAA TTTTCTGCGG CAAAGAAGGG
32201 GTATGGTTAC CTCAAAATAC CATTTTGTTT TCTCCTTTGA TTGCAGATCC
32251 TCGTCAGGTT ACCAACAGTG CTGGCATTCG TTTTAATGAG AAGGTCGTGG
32301 GGAATCGGGT AGGTGCTACC ATCTTTGGGG GAGATTTTAT TCTCCTGCGT
32351 CTTTTTGATG TTTCTCGATT CCATGTAGAT TGTGATTTCG GAATTCAAGG
32401 AGGAGTCTTC TCAGTTTTTG ATTTAGATCA TCCTGAATCG TGCATGGTAA
32451 ATTCAGATTT CTTTGTTGCC GGACTCTGGT CAGGGGCTAT AGATAAATGG
32501 AGTTTTAGGT TTCGATTGTG GCACCTCTCG TCCCATTTAG GAGATGAGTT
32551 TATTCTTACG CATCCAAATT TCCCAAGATT TAATTTGAGT GATGAGGGCG
32601 TCGATCTCTT CATTTCGTTT CGTTACACAC CACAGATCCG CTTGTATGGC
32651 GGCTGCGGTT ATATTGTAAG TAGGGATCTT ACTTTTCCTG AGCGGCCGTT
32701 TTACTGTGAA TGGGGTGCGG AACTCAGACC TTTTGGTCTG AGAGAAGGAA
32751 ATCTCCACGC ACAACCGATT TTCGCGATGC ATTTCCGTTG TTGGGAAGAA
32801 CAGAAATTTG GCTTGGATCA AAGCTATATT TTAGGCATGG AGTGGGCCAA
32851 ATTTCAAGAA ATCGGAAGGA AAATCCGTGC TGTTTTAGAA TATCATCAGG
32901 GATTTTCTAA AGAAGGCCAA TTCATTCGTG AACCGTGTAA TTACTACGGT
32951 TTCCGTCTTA CCTATGGATT CTAAACGATA ACAAAACCAT CTTCAGGGAC
33001 AGGGTGTTCA GGTCCTATGG GCAGCGTATG ATTGAGATAT CTGCGGAAGA
33051 TTGCAATGCC TGCCTTTGCT GAGGATAGGC AGAATAGGCA GTTGTGTACC
33101 CATTCAGAAC CTCGAATGGT TCCTACAGCA TGATTGGAAT TATACAAAGC
33151 TGTGATCTTG GATTTCCAAT AGTCTACAGG TCCGATTAGG AAGACGGGAA
33201 CAAGAGCTTT TTTCCCTGTT TTGAGACTAA TAAGCTCCAG AAGGAGTTCG
33251 AAATCGGTTC CCATGCCTCC GATAACAAAT ACAGCAAGGT CGACATGGAA
33301 GTCGGCCTGA CGTTCTAAAA GATCAGGAAT AGCATAGCTC ATTTTAGCTT
33351 CTACATAGGC ATTCGTGGTA TCCAAGCTAA TTAGATTCCC ACAAGAGAGT
33401 ATGGAGAGTT CTGTAGCTAC ACGATTCGCG AGTTCCATAG CTCCAGAACC
33451 CCCTCCTGTA AGGATTGCTA ACGGTGTCTG TGGTGGAAAT TCTGGGATCG
33501 TGAATTGCTG AGAAAGAGTA TGCATTCCTG TCAGGAGCTC ACGGAGAAAC
33551 TCATCATAAT CCCCAGCGAT TAGGCAGGAA CCATGAATTC CTATAAAGTA
33601 GGATTGAGCA AACTGTTCAG CTTGATGTTT AGGGACAAAC ATGCCCACAT
33651 CTTTATTTCT GCGTTTGATG TATTGTAAGA GTCGTTTCGA TTCTAAGTCT
33701 GCCCAAAATA CAGAAATTCC TGCAAAATAT AGATCGAGAA GGAAAGAGCG
33751 ATCTCGATTC GAGAAAAACT CTCCAGAAGT GGGAGAGGGA ATCTGAAAAT
33801 AGATATGTTG CAGGTAATAG CGAGAGTAGT TAGAGAGGAA CATGCCCTTC
33851 AGCGATGCTG AAGGGAAGTA GCGGGAAAAT AAAACTCCTT GGCTTGTGAT
33901 ATGATCTGTT TCCATGGCTT TTAAAAAAGG GAAACAAGGT TGGTCTTCAA
33951 TGTGCTTTTG AATTTCCCTA GCATGTCTTT CATCTGATGG GGAGATTCGA
34001 GGTTTGATGA TCCAAGAGTC TTGGGAGAGC TCAAGCAGCT CACTACCTTT
34051 GGAGATAAAC ATCGCAGCTT GATCTTCGCC TTCCGGTATG GATTCAAAAA
34101 CACGAAATAC CTCTTGAGGA GATTCTAAGG TTTCCTGGAG CATATCTCTA
34151 TAGAAGAAAA ACGAATGCTC TTTGTAAGGC TCAAGAGTAA AAAATTCTAA
34201 AGGTATTCTC TCAATAGGTT CTGAAGTGCT GCCGTAGAAT TCATAAATAT
34251 CTCCAGATTC TTGTGTGGTA GGTTCGAGAA TATCCGCTGC GGTGTGACGA
34301 AGCCCTTGGG GGAGTAAGTC CTGAACGACT CTTGCAAATA CGGTTCGGAT
34351 GTGCAGAGGC TCTGTCTTTA TGAGAAGAAT TTTATGATCT TCGGGAACGG
34401 GAGGACGATC TGTTACCATT TGATACAAAG GAAGAAACTT ACGTATTTTT
34451 AAATGGGGAC GCGTGAGTGA TTTGCTCATT AAGGGAAGGA ACCCATAAAT
34501 TGTCTCTTCG TAACAGATTG TTCCTGGAAG GATCGGAAGG AAGACAACAA
34551 GCCGATCATT AATGATCTCT AGAGTGATGA AGTGCTCAAG TTTTTTCCCA
34601 AAGCGTAGGA GCGGAGATCC TGTACGGTCT GTGTGCGTAA ACATCCTGTT
34651 GAGATAACAA GGCGAACGTA CGAGTCGGCG ATCATCAGCA GCAAAGAGCT
34701 TGCAGACAAA ACTTCCAGGC TCTAGGAGCT CCAACATAGC AGTGGCTATA
34751 GGATCTTGGC TCATGAAGAG AACGTGTAGA CGAGCTTCTT TTCGGGCTTT
34801 ATTTAGCTCC AAGTGGTTTA AAACGGCTTC GACACCTAGT TGGGCTAAGG
34851 AACTTTTTAA ATTTACTTGT ATACACTGTT GAGGCAGATG AAATCCAAGA
34901 AAGTACGCAG GAATATTCTC AATGAGGACC TCTCCTTCGT AAATGTGGGG
34951 CGAGAGTTTT TTCAAATGGG AAACGAGTCG TCCGTCTGGG GAGGCTGCAT
35001 CATGATGCGC GTGGAGTAGG TTATACATAA TTCATCGTAT TTTAAAGTTA
35051 AAAGCAATAG GTAAGGTCCC TCTTTGCGCT TCTAAGACTA GCGTTTCCAG
35101 AAGAATTTCC TACAAGGACT TCATTCTGCT TGTTGTTATA ATTTTGTGCA
35151 ATTTGTCAAG GAGGGGGTCT TGGAACGAGA AAATTGTGTT GTGGAAGGGA
35201 ATTTTAGGAA AAAAGTTTCT TAGAAAATAG GGCAGAAACT TTCTCTTGGA
35251 GAGATTGAGT AAAATGTAAA GAATAGTCTT TGCAATTGAG AATTATTTCT
35301 CTCTGGTTAC AATGGAGGAT TGGCTAAGGA GGATAGTAGG TATGCAGATT
35351 CCAAGAAGCA TTGGTACTCA CGATGGTTCT TTCCATGCGG ATGAGGTCAC
35401 AGCGTGTGCT CTCCTTATTA TTTTCGATCT TGTGGATGAA AATAAAATTA
35451 TACGCTCTCG AGATCCTGTC GTATTATCGA AATGTGAATA TGTTTGTGAT
35501 GTCGGTGGTG TTTATTCTAT AGAAAACAAG CGTTTTGATC ATCATCAAGT
35551 CTCTTATGAT GGATCTTGGA GTAGTGCAGG TATGATTCTG CATTATCTTA
35601 AAGAGTTTGG TTATATGGAT TGTGAAGAAT ATCATTTCCT TAACAACACT
35651 TTGGTACATG GTGTGGATGA ACAAGATAAT GGCAGATTCT TCTCTAAGGA
35701 GGGATTTTGT TCGTTTTCTG ATATTATTAA AATTTATAAT CCTCGCGAGG
35751 AAGAAGAAAC TAATTCGGAT GCGGATTTTT CTTGTGCTTT GCATTTTACC
35801 ATCGACTTTT TGTGTCGGCT AAGGAAGAAG TTTCAGTATG ATCGAGTTTG
35851 TAGGGGGATT GTCAGAGAAG CCATGGAAAC CGAGGATATG TGTTTATATT
35901 TTGATCGTCC TTTAGCATGG CAAGAAAATT XCTTTTTTTT AGGGGGAGAG
35951 AAGCACCCTG CAGCTTTTGT TTGTTTTCCT TCCTGCGATC AATGGATTTT
36001 ACGAGGGATT CCTCCGAATT TAGATCGCCG TATGGACGTT CGTGTTCCTT
36051 TCCCTGAGAA TTGGGCAGGT TTGTTAGGTA AAGAGTTGTC CAAAGTATCA
36101 GGGATTCCTG GGGCTGTGTT CTGCCATAAA GGTCTTTTCC TTTCTGTATG
36151 GACAAATAGA GAAACTTGCC AACGTGCTTT GCGGTTAACG TTACAAGATC
36201 GAGGGATCAT ATGACAGTAT TCAAACAAAT TATCGATGGA TTGATAGATT
36251 GTGAAAAGGT ATTTGAAAAC GAAAATTTCA TAGCTATAAA AGATCGTTTT
36301 CCTCAAGCTC CTGTTCATCT TCTTATCATT CCTAAAAAAC CTATACCACG
36351 ATTTCAGGAT ATCCCAGGGG ATGAGATGAT TTTAATGGCA GAGGCTGGAA
36401 AGATCGTGCA AGAGCTTGCT GCAGAATTTG GAATTGCCGA TGGGTATCGT
36451 GTGGTTATCA ACAACGGTGC TGAAGGAGGA CAGGCGGTAT TTCACTTACA
36501 TATTCATCTT TTAGGTGGGC GTCCTTTAGG TGCTATAGCC TAATTTTCTT
36551 TTGTTTCTGT GGATCCTTGT TCGGCTCGGA GTCCCTCCGT TATCAATTGT
36601 TGATCCAAGA TTTTGCAAAA GTTTCAGAAG AGGGCATAGG CCTTTTGGAG
36651 TCTAAAGAGT ATTCTTTACT TCAGGCTAAG CTAGTTTTAA GGGCTCTGGC
36701 TCAAAATTCT TCTTTTGATG ATTGGTTTAG AAGTTTTAAG AAGTGTCAGA
36751 TTTCCTATCC AGAGTTAGCT CATGATCGCG ATGTCTTAGA AGAATTTGGG
36801 ATTCAAGTTC TGCGTGAGGG AATCGAAAAT CCTTCCGTGA CCGTTCGTGC
36851 TGTGAGTGTC CTTGCTATTG GGCTTGCTAG AGATTTTCGC TTGGTCCCTC
36901 TCCTGCTCCA AAGTTGTAAT GATGACAGTG CTATTGTTCG ATCTTTGGCT
36951 CTTCAGGTTG CTGTGAACTA TGGCTCTGAA AGTTTAAAAA AGGCCATTGT
37001 AGAGCTTGCC CGTAATGATG ATTCTATTCA TGTTCGGATT ACAGCATATC
37051 AGGTGGTCGC TCTTTTACAG ATAGAGGAGC TATTGCCATT TTTAAGAGAG
37101 CGTGCTGAGA ACAAACTTGT AGATAGTGTA GAACGTCGAG AGGCGTGGAA
37151 GGCTTGCTTG GAACTCTCTT CTCAATTTCT AGAGACGGGT GTAGCTAAGG
37201 ACGATATTGA TCAAGCGTTG TTCACTTGTG AAGTGTTGCG TAACGGTATG
37251 TTGCCAGAGA CTACTGAGAT TTTTACAGAA CTCTTATCTG TAGAGCATCC
37301 TGAAGTGCAG GAGTCTCTCT TACTTTCTGC TTTAGCTTGG AGTCATCAGC
37351 TACAGAATCA CAAAGAGTTT CTTAGTAAAG TGCGCCATGT GATGTGCACT
37401 TCTCCATTTG CAAAAGTACG TTTTCAAGCT GCTGCACTTC TCCATCTGCA
37451 TGGAGACCCT TTGGGCAGAG ACTCTCTGGT TGAGGGCTTG CGCTCTCCTC
37501 AACCTCTTGT GTGTGAGGCA GCTTCGGCGG CTCTCTGCTC TTTAGGAATC
37551 CATGGAGTCC CTTTGGCAAA GGAGCATTTG GAGAGCCTTT CTTCTCGAAA
37601 GGCTGCTGCG AACCTCTCCA TTTTGCTTCT TGTGAGCCGT GAAGATATTG
37651 AAAGAGCTGG AGATGTGATT GCTCGCTACC TCTCCAATCC TGAAATGTGC
37701 TGGGCTATAG AGTATTTCTT ATGGGATGCA CAATGGAATT TACGTGGTGA
37751 TACCTTCCCT CTATATTCGG ATATGATTAA ACGTGAGATT GGTAGGAAGC
37801 TCATTCGCCT TTTGGCAGTA GCTCGCTATA GCCAAGCCAA GGCTGTAACA
37851 GCAACGTTCC TTTCAGGACA GCAAGCTCAG GGATGGAGCT TTTTTTCTGG
37901 AATGTTCTGG GAAGAGGGAG ATGTGAAAAC TTCTGAGGAT TTGGTTACAG
37951 ATGCTTGCTT TGCAGCAAAG TTGGAAGGAG CGTTAGCCTC GCTATGTCAG
38001 AAAAAAGATC AAGCTTCCCT ACAGAGGGTC TCTCAACTTT ATAATGACAG
38051 CCGTTGGCAA GATAAATTAG CAATCTTAGA GAGCGTTGCT TTTTCTGAGA
38101 ATCTTGATGC TGTGCCTTTT CTTCTAGACT GCTGCCATCA CGAAGCTCCT
38151 TCGCTGCGA.z\ GTGCAGCAGC GGGTGCTCTT TTCTCTATTT TCAAATAAAT
38201 ATTAATAAAA TTATTCAAGA TATAGAAGAA AAACCACGCA GTTGTAATTT
38251 CTATTTCTTA AAAATAATTT TTCAGACTGA CTTTATTCTT TCATTTTTAA
38301 GTCTTTGGAG ATAGAAAACT TTGTTATAGA TTTTTATCTG GTAGCTTTTA
38351 TAATTTATGA AGAGCGTAAG CTCAGAGCCT GTATGTCATG CACAACCTGT
38401 ATATAAAATC AGATTTGTTT TTTGAATCTC TATTCTCGTT AAGATTTCGT
38451 TATTCTGGAC GTTATCTCCA TCACCACTCC TAATTTTCCT AGCATTTCTA
38501 TCTTTAAGCT CAGCACCGTA GCTTGCTTAA AGGAAATATT TTTCATTTAG
38551 GTTGTGGAGT TCTTTATTTT ATGAATTTTT CATTATTTTT ATTTTTCCTG
38601 ATAGCTATTC AGGGAATCTG CTTGTACGTG GGACGTCGTG GTAGCAAAAA
38651 GGTAGAAGAT CGCGAGAGCT ATTTTCTTGC AGGAAGGAGT TTAAAAATCT
38701 TTCCTTTGAT GATGACATTC ATTGCCACCC AAATCGGTGG CGGTGTACTT
38751 CTTGGGGCTG CTGAAGAGGC CTTCTGTTAT GGTTATGGGG GGATTCTTTA
38801 TCCTTTAGGA GTCGCTTTAG GGTTGATTTT CTTAGGAATG GGGCCCGGGA
38851 AGCGGTTGGC AGAGGGATCG TTAACGACCG TAGTCTCTAT CTTTGAAGTG
38901 TTTTATGGTT CTAAAAAGCT CCGTAAGATC GCATTTTTAT TATCCGCAGG
38951 TTCCTTATTT TTCATCCTGG TCGCTCAGGT GATTGCTTTA GATCGGTTGT
39001 TTAGCAGCTT CCCTTTTGGC AAGTACGTAA CCGTAGCATT TTGGATTGTC
39051 TTAGCATCCT ATACCTCAAC AGGAGGGTTT CGCGGGGTCG TACGTACTGA
39101 TGTGATCCAA GCAGGATTTC TTCTTATTGC GGTGCTCGTC TGTGGTGTTT
39151 CTGTATGGCT CTCTGTCCCT AAATCCTTGT CTGTGTTGGA TCCTTTCCAA
39201 TCACTTCCTT GTGCGAAGCT TTCCAATTGG ATATTCATGC CTATGCTCTT
39251 TATGCTTGTT GAGCAGGATA TGGTGCAAAG GTGTGTGGCT GCCTCCTCTC
39301 CAAAACGCTT GCAATGGGCG GCTGTAGGCG CAGGCCTTGT TCTTCTTCTT
39351 TTTAACTTTA TCCCTTTATT TTTAGGTTCT TTAGGAGCTA AAGCAGGCCT
39401 TAAAGCAGGA TGCCCTCTGA TTGATACCAT TGCATATTTT TGCAATCCCT
39451 CACTAGCAGC TGTGATGGCT GCTGCCATCG GCGTTGCGAT TCTCTCTACC
39501 GCGGACTCTC TTATGAATGC TGTAAGCCAG CTAATCGCTG AAGAATACCC
39551 TACGTTGAAA GCCCCTTATT ATCGTTATTT AGTATTGGGT TTGGCGGTTG
39601 CAGCTCCTCT TGTTGCTATT GGTTTTACAA ACATCGTAGA TGTCTTGATT
39651 TTAAGCTATA GCCTGTCAGT GTGTTGTCTT TCAGTCCCTG TGGGTTTCTA
39701 TCTTCTAGCT CCTAAAGGTC GCCGTGTGAG CGGAGCTGCT GCTTGGGCAG
39751 GAGTGCTCGT TGGTGCTCTG GGCTATGGAT GGGTTCAGAT AGTCTCTTTG
39801 GGGATGTTTG GGGAGCTATT GGCTTGGGTA GGTTCTCTAG TCGCCTTTTC
39851 CTTTGTAGGA TTTATTGAGA TCACTTGGAA AAACAAAGTC AAAACGCAAA
39901 CTTAGATAAC CACTGCATGA GAAGATATAA CTAAAATAGA TCCTGAGTTG
39951 TTTAGGTTTC TCTTAGATCT GATAGGTTGC GCTTAGTAAG AGATCGTCAG
40001 TTTTTTAAGT TGTGTTTAGA ATCTGATACC TCTCCTTCTT TTCCAAGAAG
40051 AAGAGGGGTT CGTTTTATTT TTTATTATTC ATTCGTAGGG GCGGGAAGCT
40101 GTTTTAACTG ATAGAGCAGG TCGATAAGAG AGGAGAGCAG GGCTTGATAG
40151 AGCGTCTCTT CAGAATGGTG AAGAGTGCCC TTAAGAAGCT TTCTGCAATA
40201 CTTTAATAGC CAATACAGAG CGCAAAGTAA ACACTGTAAA AGGTAAAATT
40251 TACACGCTTG TTTTATCATA AACCTCCAAA GAAAGACGCG TGTTGGCGTT
40301 CTTCCAATTG GCTGTTAGAT AGTGAAAATA GTATTTACTA TTCAATAAAA
40351 ATATTTATAT TAAATATAAT AAAACAAAAT TTCTAATAAA CTTTTTAAAA
40401 GTATTTGGCT AAGTTTAATT TAAGAGTTCT CAAATAAAAG ATTTTTTAGT




40451 CTCTTTTATT TGAGAATCAT CAACCGATTT CAAGAAATCG CATGAGCCCG
40501 TGATTCTGAT GGGCAGAAGT GCTTTTTAGC AAATAGGTGA AGCTCCTCGG
40551 GCGTGATCTG TTGCTTCGCG TCACAAAGAG CCTTTTCAGG GTGTGCATGC
40601 ACTTCGATCA TCAGACCGTC GGCACCTACC GAGAGACCAG CAGAGGCGAG
40651 AGGAAGAACT AGAGAACGCT TCCCCGCTGC GTGGGAAGGA TCTACAATTA
40701 CAGGGAGAGA AGAGATCTCT TTAAGGAGAG CCACGGTATT GAGATCTAGC
40751 GTGTAGCGCG TAGAGTGCTC AAAGGTACGA ATTCCTCGTT CACAAAGGAT
40801 TACCCCAGGA CAGGAGGGAG AAGAAGCAAG GATGTACTCC GCTGCGCATA
40851 GCCACTCTTC AAGAGTAGCT GCTGGACTGC GTTTTAGGAT AATCGGACGA
40901 TGTGATTTGC TGACCTCTTG TAAAAGAGGG GTGTTATGCA TGTTTTTGGC
40951 TCCGATACGG AGGATATCCA CATGTTCGGC AGTAATTTCA ACATCTCGGA
41001 CATCTAAAAC TTCGGTTTCT GTAGGGAGAC CATGGATGCT CTGTGCTTCC
41051 TTATGCCAAA GCACACACTC TTTCTCCCAT CCTTGAAACG AAAATGGGCT
41101 TGTCCGTGGT TTTCTGATTG ATCCTCGGAA TACCTGAGCT CCTGCTTCTT
41151 TAACTGTAAG AGCTGAAGAG ACTGTATGCT CGTAACTTTC TAAGGTGCAG
41201 GGGCCTGCGA TCAGTATTGG CGATCCTTCT CCAAACGATA GATTTGGAGA
41251 AATAGGAACG GTATGGACCT CGTCAGGATG CTGTTTGAGG GTGCGCGGTA
41301 GGGGATAGGT AAACGTAAGA ATAAGTACCT CATGCAAAAC TAGGGATTTG
41351 AGCTGTCCGG TTTTGAGTTC TGATCTTCAG TTCTAGGGTT TCTAGGCAAG
41401 AGACAACCGT AGTGGTTGGG GAAGCGGATT AAGAAGATAC GAGCGTCTCT
41451 TTCCGGAGAA ATTGTATTTA AAAATCCATG CATCGCAACA AAGTTTTCAA
41501 TATTTTCTTC TGAGACAGTA TCTCGATCAG ACATGATCAG ACAAACAGCA
41551 AACGGCTGCA AAGGTACAGT AGTCATAGCA GCGACGTCGT TAGCCTCAGC
41601 TTGCTCTTTG ACGTCATCCA AGATATCACC GAGTTCATTA GAACTTAGGG
41651 AGTGGAGGTA GCCTATCAGT GATTGTCTAA AACCAGAGCG ACAGAAACCA
41701 TCAGATGCTG TGACCGTGTT TTTTAAGAGA TTACAAATCA ACTGATCTCG
41751 AGCTCTAGAG TATTGATTTA TAGTCGCTGT AGCTAGAGGG AAACCGTATT
41801 TTAACGAAAG TAAAAGCTTA TCTTTTACTT GTGTAAATCT ATGAGAAGCC
41851 TCTCTTTGGA AAGCGTCTAA TCGAGCAGCT AGCGGAGTCT TTTTTTGTAG
41901 AAGTTTAGGG TAGTTGTTAA GGAAACAAAA GATTAAAAAA GATTTCGAAA
41951 ACTGATCACT TAAAGCTGTG GGAGAAGAAG GCGAACCCAG TTTTGATCCG
42001 AAGAGATTTA ACCGTGTTAA AAAAACGTTC CACCCATTCC TAAAGGTACT
42051 TTCGATTTCT TTTTCTAAGA GTATGGCCTC GGTTTCCAGT TGATAATTTG
42101 GGAAAGTAGC TCGTAGAAAC TGCATACAGT TCTTTTCAAG TATGGGATCT
42151 AAAACCATCA CTTCAGGATG TCGTGACCAG AACTTGCAGA ACGTCTCTTT
42201 ATCTTTATTA GAAGGGGTTG AAGGATAGCG TGCTGGAGGC TTGAGTACTA
42251 GAGACCCTAG AATGGAACGT AGAAGTTTTT GGAGTTTCTT TTCGTTTTCT
42301 ACTTTTTTAA AGGCATATTG TAGAAAAGGA AGTAGAAGCA TTTTTAACTC
42351 TTTAACGGCA TCTTTAGTTT CTGGACGATT ATTCACTTGA GGATGCTGTA
42401 AAATAAAGAA TAAGAGCTGT ACTGCCTGTG TATAGAGGAG GTCCTGCTCT
42451 TGTGAGAGAC CACTGCCTTC ATCACGCGTT AAACGTAAAC TTCTTGCCGA
42501 GCAGACAGTA TTGAAGTGCT CTTTTAATTC AAGCTTACTA TGAATTGCGA
42551 CTAATAGCTT GTTTACAAGA TTTGAAAGTA AATCCGGCTG ATTGCTTAAA
42601 GGAGGGGTTT CAAAGGATGT GGATATTACA TTCTTTAATA GGTTTGCTAT
42651 GCGTTTATGT AGTGTTTCGA GTTGAGGCAT AGAGCTCTCT AACGTCAAAC
42701 TCTCATGAGT CGTAAGGATA GTCATGATGT TAGAAGAAAT CACCTCTCGT
42751 TGCCAAGAAG AATCTAAATC TTCATTAAGG TAGCTGAGGG AGGCGTTACT
42801 TATTTGTAAG ATCTCTTGTA TAGAGCGTTG GGGGGGCGCG GACGCTATAA
42851 TGATATCCTC TAATGATTTG TAGAAAGTAG CGCAGAGTTT GTGAAATCTT
42901 ACAGAGGGGC AGGTACATAG AAAAGCAAAG AAAGAAAGAG CTGGAGACCA
42951 AGGGAGATCC CCAGATCGTC TGTTTTCTAT AGTTGCAAGA AAGCGAGAGT
43001 ATTGACATCG GATTGCTTCA GGGAGGTGAT CTTTGACTAT AGCATTGAAT
43051 CTCTCAGTAT CCCACCCTGA ATCAGCAATG CGTCGACTCA GCTTAGCAAA
43101 AGCCTCTCGT ATTTCTCTTT CATATTGCTT TCGAAGTTGT TGATCTTCTG
43151 GAGACATCTT TTGAAGCTCT GCATCAGTAA GAAAGAGAGC GGGCAGATGT
43201 TCCAAGAGAA GAAAGAGTTT GTCTCTAGAT TGCATGTAAG GAAGTGCGGA
43251 ATGATCAATG AGATTTTGGA ACAATTCCGA ATAGGGACGA TTTGCAGCAA
43301 GAAATTCCAG AACACTGGGG AGGAGATCGT CCTGCATATC AGAAATAAAT
43351 AAGGCTCTCG CTGTTTCTTC ATTACTTGAA GCTGCGATTT GTTGGCGAAT
43401 CGCATATGCA GAGAGTTTTC TTAGAAAGGC TATCAGAGTT GCAGTGTGTT
43451 TCTGAGAAAG AATCACCCCG TCATAGAGGT CTATGAAGGA GCAATAAGTA
43501 CTGCATAACT GAAGGAGTTC AGCCATTTCT GCACAAAGAT TCGCATTTGC
43551 CGGAGAAGAG GAGGCGAAAG GCAGGTCAAG GAGACGTGTG GCTTCCTGCT
43601 CAAAGACAAT ATCATTTCTG CTGCTCTCTT CGTAGAGAGC AGATAGCCAT
43651 CCTACAGCAT AGGCACGATA AAAGCAGTTC CCATCTCCCG GTACATTCAC
43701 AAGGTAGTAA TTGTCATTTA GATAGAGAGC CTGTTCAAGA GAGAGTTGCG
43751 CAAGTCGCCG GTGTTGTTGA GGAAGATCCG GATTTTGTGC GATTTTCTTG
43801 AACTGTTTTA TTTGGAAATA CATCGGTTCG TTATCAATCC GATTGGGATA
43851 GGAAGCTACG AAATGAGGGT TAAAGTCGCC AAGTATTGGA TCAACTTGCA
43901 TGGCAGGAAG AGGCAGAGGC GCTCGTCTTA GTACCATGCT CACCATGCGA
43951 TTTAAGCATT GCCAATAGCC TTGACTAGAT GGGCGGAGAG GCATGGGGGT
44001 AGCCACCCGT ACGGGAGGAG GAGGGGCCTC TGGTGGTGGA GTCGGCTTTT
44051 TATCAGCAGG TTGTTTAGGG ACTTTTGGGC TCGCTGGTGA AGGAGCTTTG
44101 GGGGGAGGCG GGGGTGTGTC CTCTGGGGGC GGCGTGCCCG GCTTGGGAAC
44151 ATCGGGTTTT TTGTCTTCAC CATCCTTAGG CGGTTGTTTG GCAATTTCTA
44201 TAGTTTTTGG CTCTGGTCCT TTGGGAAGAG TGGGAGGCGT TGGCAAGCCT
44251 TCTTTTCTGA CAACCCGATG ATGCTTGTAG TAGTGAATCA GAAGAAGCAA
44301 ACCAAGAGTA ATGATATGGA GCAGAACGTA TCCTATGGTA CGTAGAATTC
44351 TAAGTAACAG AGGGTCTTTA GTATCAGTCG TTAAGTGGTA AAAATTATTC
44401 TTGTTATTGG GCGGGCAATG TGGTGGGGAA TTGGATACAT ACGTTCAAAA
44451 ATTGCTCGTT TTTTAATCAA AATTATTCAA AGTTAAAACT TTTTCGAGTT
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44 501 TGATGTA"^TT ATTTTTTTAT GTAAACTTTA ACACATGATA GAATTTAGCG 

44551 TATAGAGCGC AAACTTTCAT GATAAAACAA ATAGGCCGTT TTTTTAGAGC
44601 ATTTATTTTT ATAATGCCTT TATCTTTAAC AAGTTGTGAG TCTAAAATCG
44651 ATCGAAATCG CATCTGGATT GTAGGTACGA ATGCTACATA TCCTCCTTTT
44701 GAGTATGTGG ATGCTCAGGG GGAAGTTGTA GGTTTCGATA TAGATTTGGC
44751 AAAGGCAATT AGTGAAAAAC TTGGCAAGCA ATTGGAAGTT AGAGAATTCG
44801 CTTTCGATGC TTTAATTTTA AATTTAAAAA AACATCGTAT CGATGCAATT
44851 TTAGCAGGAA TGTCCATTAC TCCTTCGCGT CAGAAGGAAA TCGCCCTGCT
44901 TCCCTATTAT GGCGATGAGG TTCAAGAGCT GATGGTGGTT TCTAAGCGGT
44951 CTTTAGAGAC CCCTGTGCTT CCCCTAACAC AGTATTCTTC TGTTGCTGTT
45001 CAGACAGGAA CGTTTCAGGA GCATTATCTT TTATCTCAGC CCGGAATTTG
45051 TGTCCGTTCT TTTGATAGCA CCTTGGAGGT GATTATGGAA GTTCGTTATG
45101 GGAAATCTCC GGTTGCCGTT CTAGAACCCT CGGTAGGACG TGTCGTTCTT
45151 AAAGACTTCC CTAATCTTGT TGCAACAAGA TTAGAGCTCC CTCCTGAATG
45201 TTGGGTGTTG GGCTGTGGTC TCGGCGTAGC TAAAGATCGT CCTGAAGAAA
45251 TACAAACGAT TCAACAAGCG ATTACAGATT TAAAGAGCGA AGGGGTGATT
45301 CAATCTTTAA CCAAGAAATG GCAACTTTCT GAAGTTGCTT ACGAATAGAG
45351 GGTATTCTTA TGGCAACCTC TGTTCCTGTA ACTTCATCTA CTTCTGTAGG
45401 AGAGGCTAAC TCCTCCAACG AAAGATTTAC TGAACGAACA TCGCGAATGT
45451 ATTACGCAGC TTTAGTCCTA GGGGCTTTGA GCTGTTTAAT TTTTATTGCT
45501 ATGATTGTCA TTTTCCCACA GGTCGGATTG TGGGCTGTGG TCCTCGGGTT
45551 TGCTCTTGGA TGTTTACTTT TAAGCTTAGC TATCGTTTTT GCTGTCTCCG
45601 GTCTCGTTTT AGGCAAGACT TTAGAACCTA GTCGAGAAGC GACTCCTCCA
45651 GAAATTGTTG CGCAAAAGGA GTGGACTACA CAACAAGATG TCTTAGGGAA
45701 TGAGTATTGG CGTTCCGAGT TGATTTCCTT GTTCTTACGA GGGGATCTCC
45751 ACGAATCTCT GATTGTTGAT TCTAAGGATC GATCTTTAGA TATTGATCAG
45801 AGTTTACAAA ATATATTGAA ACTTGAGCCC CTATCTACGA CACTTTCGCT


45851 GTTAAAGAAA GATTGTGTCC ACATCAATAT CATTTTACAT TTAGTGAGAC 

45901 AGTGGAACTT ACTGGGAGTG GATCTTAGTC CTGAAGTCAC TGCGCACGCC
45951 GAGGAACTTC TACTCTTTTT GATAGAAGAG CAGTATTACT CTCCTGATAT
46001 TTTGAAATTG ATTCGCTACG GAGATGCTTT ACAAGCAACG TCTCCTTTGA
46051 TGGATTGGGC AGATTCAGGT TCCTTTAGTG TAGACGCAGA CGGGGTATTT
46101 AGCTGTCGCA GAGAAGAATG TTCTCCTGAG GATGCTTTGG CGCAATTCGA
46151 TCTTCTTTTG GCGTTGGAAA ATCCCGACAG ACGCTTCTTA AAGGATTCTT
46201 TTCTTACCTA CATTTGGTCG TCTTCATTTT TTGAGAAGTT TTTACATCGC
46251 CATCTAGAGA GCTTGCAAAG AAAGCTCCCA GAGACAGCGA TCGATGTCGC
46301 CCGCTATGAA GCACAAATAC AAACATTTCT CTCTCGCTAT TTTCAGAAGC
46351 TCGATTTGAT AAACGCAATG TCCTTAGATT GGGGATATAA CTGTGCTGAG
46401 GGAGAAAAAT GTTATGAGAG CGCAAATCAA AGATTAGACA ACCTATTTAT
46451 TGCTTTTTCT TCTTCTGTTC CTGCTATGAA GCGGCTCTTT GACAAATATG
46501 GTTCTGTGGT ACGGGTAGAT CGTAGGCAGA TTCGTGAGCA GATTCTTTCG
46551 AACACTGAAA TCTTAGAAAA TGAGTCAGGG TTCCTCTGCA GTTTGTATGA
46601 ATATCCTTTA TCCTATTTGA TAGATTGGGC TGTTTTGCTA GACTGTGTTC
46651 GCGGTACCGA AATCTCTCTA GAAGATCAGG CCGATTACAC CGTTTGTTTG
46701 CAAGGCTTGG ATTCTATGTT ATCTCAATTT GCGAGTCGTT TACAGTCTGG
46751 ACAAAAAGTA TTGAATCCTA GAGATGTTTT AAGTGAACAG GCTGCGGTTA
46801 TGCTTGTTCA TGGCTTGGCA GCACAGGGCG TGTCGTTTCA AGGATTGAAA
46851 GCTTTGATGT ATTTGACAGC CGTTCCCCAA AGAATGTGGT TAGGAGCATT
46901 GCCTTTATTT GAATCTTTTC CTGTCTTTAA TCGGATGAAA GAATTTCTTG
46951 GGGAATCTCT GGGAGACTAG GTGAATTTGT ATCAAAGAAG GAACAAGATT
47001 GCATGTTAGG TTCTTTGCCA TGTTATCCTG GTGCTGGCAA TATTGAAGAA
47051 TACAAAAATA GGTATTTCTA TTGTCAGTTA TGTGCTGAGG TCGTTAGTCC
47101 CTATGTTGTT CCTGTTATTG TAGTTGATGT GCAAGGGGCT CCTCCTACAG
47151 GTATCTTGCA GGTCTTGCGT TGTAAGCAAC ATAAATTTCA AGGCCTACCC
47201 GTACATGGCC CCATTACTTC TTTATGGGCT TTGGAGCCCG TGGGTAAGGG
47251 AGCTCCGCAG CTGGAGTCTG CAATGTACGA GCTCTGTTCT CAAGTAAGGA
47301 ATTTTGACAT CTGCTCTATT GTGAGTTGGG TCTTTGGTGG GTTGTGTATT
47351 TTTGCAGGTC TGATTGTCGG GGTAATGGTT GAAGCCCCTT TGATTGCGGG
47401 ATTAAGTGCT TGGGTGATTC CCTGTATCAT TGGAGGGGTT GGTGCCATTT
47451 TATGCTTGTT TGCGATCTTG ATGGCGTACT TGGGAAGAGG GAGAGTCCGT
47501 GAGTGGCTCA ATCTTTCACA CGAATATATA ACGCAATGTC ATTGTCGTCA
47551 GATACAGGCA CATTCTCAAA ACTATTCTGT GATCACAGAG TATCCTGCAA
47601 CCTGTGCATT ATCTCAACCG ATTACAAAGT TACCTAATGG ATCACGCAGA
47651 GATAACTAAG CGTGTTCGTC AGTTATTTCT CACATTTTCT CATGAATCTT
47701 TTACTGCGCT GCACGAGATC CCTCTCGAAA ATTTTTAAGG ATAGATACTT
47751 GGAAACTATG GTTTAAAAAG CTATAGAGGA TTCTAAATTG GGGTTCTAGC
47801 AACTTCTTGA CTTTAAGATC CAAAGTTAAG AGACTGACTA ATTATTTTTG
47851 TTTGCTTGTG TTTCCAGATG AGCAATTGGT ATGGTAAGAG ATATTCAGAG
47901 TGAATCTATA GGGAAATTAG TATTTTTAGG CACAGGAAAT CCCGAAGGAA
47951 TTCCCGTGCC GTTTTGCTCA TGTAGAGTGT GTCAAAACAC AGGGATTCAT

48001 CGTTTACGAT CTTCGGTACT CATTCAATAT CAAAACAAGA CTCTAGTGAT 

48051 TGACGCAGGC CCTGATTTTC GTACGCAGAT GTTAGTTGCA GGGGTTTCCG 

48101 AGCTCGATGG GGTATTTCTG ACCCATCCCC ACTACGATCA TATCGGTGGT

48151 ATTGATGATT TACGTGCGTG GTACATAGTC ACGCAGCGTT CGTTGCCTTT 

48201 GGTCCTTTCT GCAAGCACCT ATAGATTTTT AAACAAGGCT AAAGAGTATC 

48251 TCTTCGCCAC TCCGAATGTA GAGTCTTCAC TTCCCGCAGT TTTAGAGTTT 

48301 ACAATCTTGA ATGAGGACTG TGGGCAGGAG GAATTTCAGG GCATTCCCTA 

48351 TACTTATGTT TCCTATTATC AAAAGTCGTG CCATGTAACG GGTTTTCGTT

48401 TTGGAAATCT TGCTTATCTT ACAGATCTCT GTAGCTATGA TGCAAAAATT 

48451 TTCAGTTACT TAGATAATGT AGAGACATTG ATCTTGTCTG CGGGTCCATC 

48501 GGAAACTCCT ATTCCTTTTC AGGGACACAA ATCTTCGCAT CTTACTGTAG
48551 AAGAAGCCAA AGCTTTTGCG AATCATGCAG GGATAAAGAA TTTAATTATT
48601 ACACATATCA GCCACTGTTT AGAAGCAGAG CGTGACCAGC ATCCAGAGGT
48651 CACATTTGCT TATGATGGCA TGGAGGTCCT TTGGACACTA TAGATACGCC
48701 CGGGGAACAG GGTTCTCAAT CTTTCGGAAA TTCGTTAGGG GCCAGGTTCG
48751 ACTTGCCTCG TAAGGAACAG GATCCCTCTC AAGCTTTAGC TGTGGCTTCC
48801 TATCAAAATA AGACAGATTC TCAGGTCGTT GAAGAACATT TAGACGAGTT
48851 GATCTCACTT GCGGATTCCT GTGGTATTTC TGTTTTAGAG ACCCGTTCTT
48901 GGATTTTAAA AACACCCTCA GCTTCCACCT ATATCAATGT GGGGAAGTTG
48951 GAGGAGATCG AAGAAATCTT GAAAGAGTTT CCCTCTATAG GGACTTTGAT
49001 CATAGATGAG GAGATCACTC CATCCCAACA ACGGAATTTA GAGAAACGCC
49051 TTGGCCTTGT CGTTTTGGAT AGGACGGAGT TAATTTTGGA AATCTTTTCC
49101 AGCCGTGCCC TTACTGCAGA GGCAAATATC CAAGTCCAAC TTGCACAAGC
49151 ACGTTATCTC CTTCCTCGTC TTAAGAGACT TTGGGGGCAC CTATCTCGGC
49201 AAAAATCTGG GGGAGGTAGC GGAGGCTTTG TTAAGGGGGA AGGAGAAAAA
49251 CAGATCGAGC TAGACCGTAG AATGGTCCGT GAGCGTATCC ATAAGCTGTC
49301 AGCACAGCTG AAAGCTGTGA TCAAACAGCG TGCGGAACGC CGTAAAGTAA
49351 AATCTCGACG AGGAATTCCT ACCTTTGCTT TGATAGGGTA TACAAATTCA
49401 GGGAAGAGCA CCCTATTAAA TTTGCTGACG GCTGCTGATA CGTATGTTGA
49451 AGACAAGCTA TTTGCAACTT TAGATCCCAA AACGCGCAAA TGCGTACTTC
49501 CAGGAGGCCG TCATGTCCTT CTTACTGATA CTGTAGGCTT CATTCGAAAA
49551 CTTCCTCATA CTTTGGTAGC AGCATTTAAA AGTACTTTAG AAGCAGCTTT
49601 CCATGAAGAT GTTCTTCTGC ATGTTGTCGA TGCTTCGCAT CCTTTAGCTT
49651 TAGAGCATGT ACAGACGACC TACGATCTCT TTCAAGAGTT GAAGATTGAA
49701 AAGCCTAGGA TCATTACTGT GTTGAATAAG GTAGATCGGC TTCCTCAAGG
49751 AAGTATCCCT ATGAAATTAC GTTTGCTCTC TCCTCTTCCT GTATTGATTT
49801 CAGCAAAAAC TGGGGAGGGG ATCCAGAATC TTCTTAGTCT TATGACGGAA
49851 ATCATTCAGG AGAAAAGTTT GCATGTGACT TTGAATTTTC CTTATACAGA
49901 ATATGGAAAA TTTACGGAAC TTTGCGATGC CGGGGTTGTG GCCTCGTCAA
49951 GGTATCAAGA AGATTTTTTA GTTGTTGAAG CGTATCTTCC TAAGGAGCTG

50001 CAAAAGAAAT TTCGTCCTTT TATTTCTTAT GTTTTCCCTG AAGATTGTGG 

50051 AGATGACGAG GGTAGAGGGC CCGTCTTGGA GAGTTCTTTC GGGGATTAGG 

50101 TAGTTTTCTT CTAGGACATC GAATCTTTGT TAGTGAGAAA AAGAGTGATA 

50151 TTTTAAAATA GCCACTCATC GCTAAATCTA TTGAAGTCTC TAGAGGTATA 

50201 TGACGGTTGC GGAAGTCAAA GGAACATTTA AGCTGGTCTG TTTAGGCTGT 

50251 CGGGTGAATC AGTATGAGGT CCAAGCATAT CGCGACCAGT TGACTATCTT 

50301 AGGTTACCAA GAGGTCCTGG ATTCTGAAAT CCCTGCAGAT TTATGCATAA 

50351 TCAATACGTG TGCTGTCACA GCTTCTGCTG AGAGTTCGGG TCGTCATGCT 

50401 GTGCGTCAGT TATGTCGTCA GAACCCTACA GCACATATTG TTGTCACAGG 

50451 TTGTTTGGGG GAATCTGACA AAGAGTTTTT TGCTTCTTTG GATCGGCAAT 

50501 GCACACTTGT TTCCAATAAA GAAAAATCCC GACTTATAGA AAAAATTTTT 

50551 TCCTATGATA CGACCTTCCC TGAGTTCAAG ATCCATAGTT TTGAGGGAAA 

50601 GTCTCGAGCT TTTATTAAAG TTCAAGATGG CTGTAATTCT TTTTGCTCGT 

50651 ACTGCATTAT TCCTTATTTG CGGGGGCGTT CGGTTTCTCG TCCTGCTGAG 

50701 AAGATTTTAG CTGAAATCGC AGGGGTTGTA GACCAAGGAT ATCGCGAAGT

50751 TGTAATTGCA GGAATTAATG TTGGAGATTA TTGCGATGGA GAGCGTTCAT 

50801 TAGCCTCTTT GATTGAACAG GTGGACCGGA TTCCTGGAAT TGAGAGGATT 

50851 CGAATTTCCT CTATAGATCC TGATGATATC ACTGAAGATC TGCACCGTGC 

50901 CATCACCTCA TCGCGTCACA CTTGTCCTTC GTCACACCTT GTTCTTCAAT 

50951 CGGGGTCGAA TTCAATTTTA AAGAGAATGA ACCGGAAGTA TTCTCGCGGA 

51001 GATTTTTTAG ATTGTGTAGA GAAGTTCCGT GCTTCTGATC CTCGCTATGC

51051 CTTTACTACA GATGTGATTG TCGGATTTCC TGGAGAGAGT GATCAAGATT 

51101 TTGAAGATAC TTTGAGAATT ATTGAAGATG TAGGCTTTAT TAAAGTGCAT 

51151 AGTTTCCCTT TCAGTGCTCG TCGTCGTACT AAGGCATATA CTTTTGATAA 

51201 TCAGATTCCC AATCAGGTGA TCTATGAGAG GAAGAAGTAT CTTGCTGAGG 
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51251 


TTGCTAAGAG 


GGTAGGCCAG 


A A A A A T*/"* A 

AAAU Avj A i OA 


ibAAbbbI i 1 


APPAPAPACT 


51301 


ACAGAGGTGC 


TTG n bAb AA 


AU 1 AAuUvj^jLs 


P APPTTPPTA 


PPPGTPACTC 


513 51 


TCCTTAl ill 


CjAAAAooi 1 1 






GTAGCTATCA 


51401 


ACACTCTAU 1 




\^ 1 i or\ i rVOOVj 


TAGAGGAAGA 


AGGGCTGATT 
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GGGG AGAT TG 


X A 1 Cj A i ALj A 1 


A i A/i i VjV^/lrtV^ 


ATTTTAAGCr 


GTATACTATG 


51501 


GTCCCAGGAC 


AAAAACTCCC 


1 Ai 1 UL. i. (jLjA 




ATGPTPAGGT 
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PA A ArAGGAA 


ATCTTAAATG 

c\ X \^ X X njijit X \j 


51601 
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A/^AO^rn/^OA A 
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PPPPPTTTa A 
VjjVjoL-L. 1 i I /Vrt 


A APHPTTTPP 


TGTTTTGGAA 

X O X X X X 
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GATTTACATC 


GTGGGGGGCr 


•PPP a PTP A PT 


TPTf^AGPfiPT 


ACAAGTATTA 
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TCGGGAGAGT 


r'r^Apapa aTP 
(jrLAL.Av-AA i u 


TATP A A APPP 


AAAPTGrPTT 


51751 


CGGCAGCGCA 


AGCAGGGCCC 






TAAGPATGCA 


51801 


GATTGGCAAA 


AGGTCCGTTG 


TLG i Lb 1 bAi 


PT"P A A AP A A A 


TTPTTPPPPT 
X X V- X X V^Vi^W \- X 


51851 


ATGGTTCCGT 


TTCGCCGCTA 


1 bbb 1 bb 1 AA 


PPP A TPPT AT 
bbbrt I LU 1 Ml 


PPPP ATPT AG 


51901 


AGACGACGGC 


TATCGGTAGC 


(Ttrpoorri A A aP a 
1 i bb 1 AAAbA 


PTPPPP ATPA 


AAGAGTTTTA 

rVf^OrtVJ X X X X r\ 


51951 


CATAGGGAAA 


/^m A A f~* A O A T* 

CTACAGAGAT 


T'PPT'PPT'PPP 
I bb 1 bb 1 bbb 


TTAPTPTPP A 


TAGCCCTTGC 


52001 


GGGATTTTCA 


GAGTGCTTTC 


rprp^j^rn A OO a P 

i 1 bb i AbbAb 


PT ATP ATGAA 


GAGTTCCAAG 


52051 


GAATCCTCCC 


/*«^ A A /"» A mo O A 

CCAAGATGGA 


o A T*/^/" a P 5. PP 
bAi bbAbAbb 


PPPP APTTPP 


TTTTGAGCTT 

X X X X \J{^\J\.r X X 


52101 


CTCTCGTATA 


GCTTTGGTAT 


p a Tpp a ap AT* 

bA i bb AAbA i 


ATTTTTPTPA 
e\ X X X X X\^ i. vjri 


GACACCAGGG 


52151 


ACAGCTAGTA 


OA O A rp OOT'T'O 

GAGATCCTT L 


PT'P P A TT A PP 
blbbAl irtb*- 


TPPTPAATTT 


CCTTGTGGCC 


52201 


GCTTGATTCA 


TGTTGCCCTT 


bb i AAi b i 1 b 


PP APTTTPTP 
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TATCGTCTGG 


52251 ACTAAGAAAA CTATCCGTCA GGTCGAGCTC CATGCAGAAT ATAGTGGCGA
52301 GGTATTTTTA AAGTTTTGTT CTTCACTATG CAGTGCGCGC CTTCGGGAAT
52351 GGTCGGAGCG ACGTCTCTCT GGATCTAAGA GACTTTCTTT AGGAGAAACT
52401 CTGGAGATAA AAGCAGGAAC CACATATTTA TGGGATTGTT TTCATAAATA
52451 GATAGCCTTC CATGGTTGAT AAACTGATCC ATCCTTGGGA TCTTGATCTG
52501 CTCGTCTCAG GACGACAGAA AGATCCCCAT AAACTCTTAG GGATCCTTGC
52551 TTCTGAAGAT TCTTCAGATC ATATTGTTAT TTTTCGTCCA GGGGCGCATA
52601 CGGTTGCTAT TGAACTTCTA GGAGAGCTTC ACCACGCTGT AGCTTATCGT

52651 TCGGGGCTCT TTTTCTTATC CGTTCCCAAA GGAATCGGAC ACGGGGATTA
52701 CCGTGTGTAT CATCAGAATG GACTTCTCGC TCATGATCCC TATGCGTTTC
52751 CTCCTCTGTG GGGAGAAATT GATTCTTTTT TATTCCATAG AGGAACGCAT
52801 TACCGCATTT ATGAACGCAT GGGGGCAATC CCTATGGAAG TTCAAGGAAT
52851 CTCAGGGGTG CTCTTTGTTC TTTGGGCTCC CCATGCGCAG AGAGTCTCTG
52901 TAGTCGGAGA TTTTAATTTT TGGCATGGCC TTGTCAATCC TCTACGTAAA
52951 ATTTCCGATC AGGGGATCTG GGAGCTTTTC GTCCCAGGCT TGGGAGAGGG
53001 AATACGGTAT AAGTGGGAAA TCGTTACCCA ATCGGGGAAT GTGATTGTAA
53051 AAACAGATCC TTATGGGAAG AGCTTTGATC CTCCACCCCA GGGTACAGCT
53101 CGTGTTGCGG ATTCTGAGAG CTACTCTTGG AGTGATCATC GTTGGATGGA
53151 GAGGCGCTCG AAGCAGAGTG AAGGGCCCGT CACGATCTAT GAAGTGCACT
53201 TAGGCTCTTG GCAATGGCAG GAGGGAAGGC CCTTAAGCTA CAGCGAAATG
53251 GCGCATCGCC TTGCTAGCTA TTGCAAGGAA ATGCACTACA CTCATGTGGA
53301 GCTTCTTCCC ATTACGGAGC ATCCCCTGAA TGAATCTTGG GGCTATCAAG
53351 TGACGGGATA TTATGCTCCA ACATCAAGAT ACGGGACTCT CCAGGAGTTT
53401 CAGTATTTTG TAGACTATCT ACATAAAGAA AATATTGGTA TTATTTTAGA
53451 TTGGGTGCCG GGACATTTTC CCGTAGATGC GTTTGCTCTT GCCTCTTTTG
53501 ATGGGGAGCC TCTCTACGAG TACACGGGGC ATAGTCAGGC TCTTCATCCC
53551 CACTGGAATA CGTTTACCTT TGACTACAGT CGTCATGAAG TGACCAACTT
53601 TTTACTAGGG AGTGCTTTAT TTTGGCTCGA TAAGATGCAT ATTGATGGCT
53651 TACGTGTGGA TGCTGTGGCC TCTATGCTGT ATCGTGATTA TGGCCGTGAA
53701 GATGGAGAAT GGACGCCTAA CATCTATGGA GGTAAGGAGA ACTTAGAGTC
53751 TATAGAATTT TTGAAACACT TAAATTCTGT AATTCATAAG GAGTTCTCTG
53801 GAGTGCTCAC CTTTGCAGAG GAATCCACAG CGTTTCCAGG AGTCACTAAG
53851 GACGTAGATC AGGGAGGTCT GGGGTTTGAT TACAAATGGA ACTTAGGTTG
53901 GATGCACGAT ACCTTTCATT ACTTTATGAA GGATCCCATG TATCGTAAAT
53951 ACCATCAGAA AGATCTGACA TTTAGCCTTT GGTATGCCTT CCAAGAGTCT
54001 TTTATTCTTC CTCTCTCGCA TGACGAGGTG GTCCACGGTA AGGGCAGCTT
54051 AGTGAATAAG CTTCCCGGGG ATACCTGGAC CCGATTTGCT CAAATGAGAG
54101 TGCTCTTGAG CTACCAGATC TGTTTGCCTG GGAAAAAGTT ACTGTTCATG
54151 GGTGGGGAAT TCGGACAATA CGGCGAGTGG TCTCCTGATC GTCCCTTAGA
54201 TTGGGAGCTT TTGAATCATC ACTACCACAA AACTTTGCGA AACTGTGTCT
54251 CTGCATTGAA TGCGTTGTAT ATTCACCAAC CCTATTTATG GATGCAAGAG
54301 AGCTCTCAAG AGTGCTTCCA TTGGGTAGAC TTCCATGATA TAGAAAACAA
54351 TGTCATTGCC TATTATAGAT TTGCAGGCAG CAATCGTTCT TCGGCGCTTC
54401 TCTGTGTCCA TCATTTCAGT GCGAGTACTT TTCCTTCCTA TGTTTTAAGG
54451 TGTGAAGGTG TAAAGCATTG TGAACTCCf T CTCAACACTG ATGATGAGTC
54501 TTTTGGAGGC TCAGGGAAGG GAAATCGGGC TCCTGTGGTC TGTCAAGACC
54551 AAGGGGTCGC TTGGGGTTTG GATATAGAGC TCCCTCCTTT AGCTACTGTG
54601 ATCTATTTAG TTACTTTTTT CTAAAAATTT AAATACTTTA TTTGTAAATT
54651 GTTGTGGGAT TGTTCTATTT TGTGGTGTAG TTGATATTAA TAATTTATTT
54701 TATAATTAAA AATAATTATT AGTATTTCTT TTATGTCTAC ATCACCAATT
54751 AGCAACGATC CCCGATATTT GTCTTTGTCT AATGCAACTG AGAAAACTTC
54801 TCTTCTTGCA AATAGCCGGA GTCTCTCGCC AGTACCAAAT TCCCTAGTTC
54851 CTAGCAATCC TGAAGATACA GGATTGCGAA AAAGTATTTT CACCCATTCC
54901 GTGACTTTAT TTGCTGGCCT GGTTGTTTTG CTGGTAGCGG TTTCTGTTGT
54951 TGTTGTCGCT TTGACCGTCT TAGCTCCCGG AGTTCCTCAG GCTATTCTTC
55001 TTGGAATCGC CATTTCAGGC GTGGGTATTG GTGGATTTTC TATAATGAAG
55051 AGCTTGGTTT ATATGGTCCG AGACTATATG TCCCCCAGGA TGCAGGAGTC
55101 GAGCAGAATC AAAAGTGCTT TAGCTGTAGG GACTGGATTT ACTGTCATGG
55151 GTTTGGTCAT GAAGGTGGGG GCGAATTTTG TTCCTGGAGG GTATGGGGGT
55201 CTCGTGGGTA GCTTGGGATC CAGTGCGTAT TCCCGGGGAA GCCAAACCAC
55251 ATTAGCAAGC TTCAGTCATT ATATTTATAC TAAGTTTTTC CGTTCTGAAA
55301 AAGTTGCTAA AGGGGAGAAG CTTACAGAAG CAGAAACTAT AAAAGhGGCG
55351 AAAAAATTAC ACTATATCAC GTTGTCAATT GCCACTATTG GCGTTGGTCT
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TGCGGTTTTG 


GGGATTCTCC 
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AGGAACGGTA 


TTGCTAGGAG 


55451 GCGCTCCCGC AACGATTGCT ATTATTTTAG CTCCCCCTTT AATTTCTATA
55501 GGGCTTACGA CGGTTTTGCA AACGATACTC CATAGTAGTA TCGGAAAGTG
55551 GAGAGCCTTT CTGCTTACTC AAGAAAAAAA AGATCTTTTT GTAGACACCT
55601 CCCTGAAAGA CATTCGCTTA GAAAAATTGC CCCCCAGTGA GGTGGAAGAG
55651 AGTGAAACTT CCCAATCTGT GATAGAAGTT CCAGATTCAG AGGGGATTGC
55701 AGAGACGAGG ATCTCTGCGG AAGAAATCGA TACGAGGCTT TCCCTGACGA
55751 CAAGACAGAA GGTCATCTTT GCTCTTGCGA CACTCTTGCT CTTAGCAAGT
55801 ATTGCTGCCT TCATAGTCAC GGGATTTGGT GGATTGACAG TCATGCAAGT
55851 TCTCCTTGTT GCTTCTGTAG GATCGGCGGT TGCTTCTGTA ACACTCCCTA
55901 TGGTTTCCTC AGGATTTTCC TACGTCGCCT ACCAACTGAA AGCAAGATTG
55951 AATATCAGTA AATTACGTTG GAAAGAAGCA AAAAATAAAA AGCGGGTGCG
56001 CCAGTTCTTA ATTGAGTCTG GAGTGATTGC CTCGGATCGA GAATTTAACC
56051 AAATGTGGAA GACAGTCTAC AAAAAACAGA TTCAGAAGAC TGACGCTGCA
56101 ATTCGTGAAG AGGTTCGCAA TTTTGAGAAG GGTGGGGAAG TGAACAGCGC
56151 CCTTGTTGGT GGAATCTTAC TTGGTGTAGG AACTGGGATC ATGCTTCTTG
56201 CCCTGGTCCC TGCATTTGCT CCTATCGTTC CTGGTATTCT TGCTCTTGGA
56251 GGATCGACGT TAGGAATCGC GGGATCGATT TTAATGAGGA AGTTTGTCAA
56301 CTGGCTCTAT GATGAGCTTG TGAAGCTCTA TGAGCGTCGA CGTAATCGCC
56351 GTGAGCTTCT CTATGGTCCT GAAAGTAAAA TGCGCTCCAT TGCTACGGAT
56401 TTAGTTGTTG AGGCTCTTGC TGCTAGCCAC GATCATCTAT TTGATCTTGA
56451 TGGTCCCGTA GATTTTATTG ATGTGGATGT AGATATAGAT GGAGCTGCTT


56501 


AGGCCAGGTC 


CTTGAATGTA 


AGAl UUl CtjA 


UL. ill \j\j\3r\\D 




56551 TTCTCTAATT TTATTTTCTC TTTTACATCT AGTAGTTTAC TTAATTATAT
56601 AATCAGCATT CTTTTTGTTT ATTTTAATTT ATATTTTGTT TTTAAAATAT
56651 TTTTTATTTT AACATTTGTT TAATAGTTTT TATTAAATAA TTTATCATTT

56701 TAAGGTTCAA TTATGGCAGT TGGTGGCGTA GGCGGCTCAA GATCTCCTTC 

56751 CCCCATTCCT CCTAATAGAA GGAATAGTGA GGATGGAAAA GTAAGTCCTA 

56801 AAGACAACTT AGGGGAACAT ACAGTTAGCA GTAGTGACAG TAGTCTTGCA 

56851 AGTCAGGGCC CTACAATAGA AGAGAGAAAA GCCCAGTTAG GCGGGACTGA 

56901 TAAAATTCGT TTGCCATCTG TCAAAGAACC CGGAGATTCT CAAACTTCAG 

56951 GACGTTCTGG GGTACTTCAG AGAATTTGGA AAGGCGTTAA AGGGGTCTTT 

57001 AAAAAAACCC CTCAAGCGCG TCCTGAAGTT TCTAGTCCAC GTCTTCCATC 

57051 CCATGTGCAA CATGGCCAAC GTCTTCCTGG ACTCGAGGGC TTTAGAGATC
57101 GTATCCAGAA AAGATCTGAA AATCCAGAGG CAGATTTAGG GAAGATGAAA
57151 CGTTCCTATT CTGATGGTGA CCTTGATCGA GTAGGACACG ATTCTAATGA
57201 AGATTCTACA GAGGATAGCC GTTCTGAAGG AGGAGAGCCT TCTTCAAAGA
57251 GTTCTTCCTT CTTATCAGGA GTTCGAGGAG CGGTGTCTAA AGTTCATGGT
57301 GCCCTAGGTG ATATTAAAGG AAAGTTCCAG CGTTCTGCTT CCGAAGATGA
57351 TTTAACAACT CAGGGCGAAG ATTCTGCCGG CGATACTGTA AAAGAAAGGC
57401 GTTCCGAAGA AGCAGAGGCT TCTTCGAAGA GTTCTTCTTT TTTATCAGGA
57451 GTTCGAGGAG CGACGTCTAC AGTTCAGGGA GCCTTAGGTG ACGCTAAAGA
57501 GAAGGTTTCG GCGTTCGGAG AGCAGGCTGC AGGTGCAATC AGATCAGCAC
57551 CAGGGAATAT CAGAACTAGA TTCCAACGTT CTTCATCGGA AGGTGATCTT
57601 TCTAATGTGA ATAAAGCAGC AAAACATCTG CGTAAGGCTT TAGAAAATTT
57651 GGAAAAAGTA GCTCCAGAAC AAGTGTCACC AGAGGTGGCT TCTAGGGTGC
57701 AATCTCTTCT TGCACGCATG GAGCAATTGA CTCATCAGGA ACCTCCTACT
57751 GTGGAGGATC TTATTACTTT CGTAGAATCC AATGTAGGTA GTGATTCTGT
57801 GGAGTATGCA TCCATCGTAC CTCAAGATGG ATCGCAAGCC CCAGCAGAGA
57851 CTGCGGAAGC TCCCGAAACA GGTGGGGTAG AGGGATCTGC AGCGCAGGGA
57901 GCATGGAAAG CGTTACGGGA TTTTGTAGTT AGCATATTCC AAGCGGTAGC
57951 GAGCTTCTTT AGGGCAATTG CTTCAAGATT AAGTTCAGCA CGACGTGAAT
58001 CAGCTGTAGA TGATCTTGCA TCAGAAAGTA ATACACAATG GTTTGTGGAG

58051 CAAGAGGGCG TTTCAAATCC ATCGGCTGCA CCTAGCTTAT CTTTTGCGGA 

58101 AGAGATCGCT CGTAGAGCTG CAGAAATGAG TAACAGAAAT GCCCAGAGTC 

58151 TTGAAAAATT GGAATCAGGC AATGTGACTG ATCCTGTCAT TCAACAAGGC 

58201 TTAGGATTAG CTAGATCATT TGCTCCAGAG GGACAGTAGT CGTTATCTCA 

58251 CTGTTTCTCT ATGCGCAAGG GAAACTTGAA GAGTTTTAAT TAAAACTCTT 

58301 CAATATGTTG ATTATTTTAA TATATTTAAA AGCATTTTTG TTGTTTTTTA 

58351 ATAAAATTAA ATTGTTTCAG AAAAAAGATT ATTCTTTTTA GGAAGTGTTT 

58401 ATGGCATCAG GAATCGGAGG ATCTAGTGGA TTAGGAAAGA TTCCACCTAA 

58451 AGATAATGGG GATAGAAGTC GATCGCCCTC TCCTAAGGGA GAACTTGGCA 

58501 GCCACGAGAT TTCCCTGCCT CCTCAAGAAC ATGGAGAGGA AGGAGCTTCA 

58551 GGATCTTCGC ATATACATAG CAGTTCCTCT TTTCTACCAG AAGATCAGGA 

58601 GTCTCAGAGC TCTTCTTCGG CAGCTTCTAG CCCGGGATTT TTTTCTCGCG 

58651 TACGTTCTGG GGTAGACAGG GCCTTAAAAT CATTTGGCAA CTTTTTTTCC 

58701 GCAGAGTCTA CGAGTCAAGC GCGTGAAACG CGACAAGCTT TTGTTAGATT 

58751 ATCAAAAACC ATCACCGCGG ATGAGAGACG GGATGTCGAT TCATCAAGTG

58801 CTGCTGCTAC AGAAGCCCGA GTGGCAGAGG ACGCGAGTGT TTCAGGCGAA 

58851 AATCCTTCTC AGGGGGTTCC AGAAACCTCT TCTGGACCAG AACCTCAGCG 

58901 TTTATTTTCT CTTCCTTCAG TAAAAAAACA GAGCGGTTTG GGTCGGTTGG 

58951 TACAGACAGT TCGCGATCGC ATAGTACTTC CTAGTGGGGC TCCACCTACA 

59001 GACAGCGAGC CTTTAAGTCT CTACGAGCTA AACCTCCGTT TGAGTAGTTT 

59051 ACGTCAGGAG CTCTCTGACA TACAAAGTAA TGATCAGTTG ACTCCAGAGG 

59101 AAAAAGCAGA AGCCACAGTT ACCATACAAC AGCTGATCCA AATTACAGAA 

59151 TTCCAATGCG GCTATATGGA GGCAACAGAA TCTTCGGTAT CTCTAGCAGA 

59201 AGCTCGTTTT AAGGGGGTAG AAACTAGTGA TGAGATCAAT TCCCTCTGTT 

59251 CAGAACTGAC AGATCCTGAG CTTCAAGAAC TCATGAGTGA TGGAGACTCT 

59301 CTTCAAAACC TATTAGATGA GACTGCCGAC GATTTAGAAG CTGCTTTGTC
59351 CCATACTCGA TTGAGTTTTT CTTTAGACGA TAATCCAACT CCGATAGACA
59401 ATAATCCAAC TCTGATTTCT CAAGAAGAGC CTATTTATGA GGAAATCGGA
59451 GGAGCTGCAG ATCCTCAAAG AACTCGGGAA AACTGGTCTA CAAGATTATG
59501 GAATCAGATT CGCGAGGCTC TGGTTTCTCT TTTAGGAATG ATTTTAAGCA
59551 TTCTAGGGTC CATCTTGCAC AGGTTGCGTA TTGCTCGTCA TGCAGCTGCT
59601 GAAGCAGTGG GTCGTTGTTG CACGTGCCGA GGAGAAGAGT GTACTTCTTC
59651 TGAAGAGGAC TCGATGTCGG TGGGGTCTCC TTCAGAAATT GATGAAACTG
59701 AAAGAACGGG CTCTCCGCAT GACGTTCCAC GCAGAAATGG AAGTCCACGT
59751 GAAGATTCTC CATTGATGAA TGCCTTAGTA GGATGGGCAC ATAAGCACGG
59801 TGCTAAAACC AAGGAGAGTT CAGAATCAAG TACCCCGGAA ATTTCGATTT
59851 CTGCTCCCAT AGTGAGAGGT TGGAGTCAAG ACAGTTCCGT CAGTTTTATT
59901 GTTATGGAAG ATGATCATAT TTTCTATGAT GTTCCTCGTA GAAAAGATGG
59951 AATCTATGAC GTTCCTAGTT CCCCTAGATG GAGTCCTGCG CGAGAGTTGG
60001 AAGAGGATGT TTTTGGAGAT TATGAAGTTC CTATAACCTC TGCTGAACCA
60051 TCTAAAGACA AGAACATCTA CATGACACCT AGATTAGCAA CTCCTGCTAT
60101 CTATGATCTT CCTTCACGTC CAGGATCGTC TGGAAGCTCA CGTTCTCCGT
60151 CTTCAGATCG CGTACGAAGC AGCTCACCAA ATAGACGGGG TGTGCCTCTT
60201 CCTCCAGTTC CTTCACCTGC TATGAGTGAG GAGGGGAGCA TTTATGAGGA
60251 TATGAGCGGT GCTTCAGGTG CAGGTGAAAG TGATTATGAA GATATGAGCC
60301 GTTCCCCCTC TCCTAGAGGC GACTTGGATG AACCCATATA TGCTAATACT
60351 CCTGAAGATA ATCCATTTAC TCAGAGAAAT ATAGATAGAA TTTTACAGGA
60401 GAGGTCAGGC GGTGCTTCCG CTTCTCCTGT AGAGCCTATT TATGATGAGA
60451 TCCCATGGAT TCATGGCAGG CCCCCTGCTA CACTTCCAAG ACCCGAGAAT
60501 ACATTGACTA ATGTTTCGCT TAGAGTGAGC CCAGGGTTTG GACCAGAAGT
60551 AAGAGCCGCT TTGCTTAGCG AGAGCGTGAG TGCTGTTATG GTCGAAGCAG
60601 AGAGTATTGT TCCTCCAACA GAGCCGGGGG ACGGAGAATC AGAATATCTA
60651 GAGCCCTTAG GGGGACTTGT AGCTACAACG AAAATCTTAC TACAAAAAGG
60701 ATGGCCTCGT GGAGAGTCGA ATGCTTAGGA TTTAAGTAGT TCTTTCGAAT
60751 CTCTAGTGAG GATGTATCGG GTTCTTAATT TTTATGGGGG AAACGTATCT
60801 GTGGATTTCC CTTAGCTTCT CCCATAAGAT TCATGATGGT AGAGAGTGTT
60851 CACTACTCTC TATTTGGTCT TTACAGGTTG CATTGTCTAT ATAACATGCT
60901 TTTAAAACTT AAAGGTCGTT CCTGCGTGAA GGTAATGTGT CGTCGTTGAT
60951 GAGGATACCG AGCCTTGATA GTCTAAGAAC ACCGAAAGTT TAGGGAAGAT
61001 AAAAATTTGG TTTCTTCCTT TAAAAGCAAT GGCATTGCGA GCAAGGGTGG
61051 TTCCTGATAG GAGCCATGAC GATCCACTAG ATTCTAGACT CACGTTGATC
61101 TCAGGATTTT GTTGGTAGAG GACAGGCTGA TAAGCAAGCT CTATGTTCCA
61151 ATAGGTAGGA AGACGGAACT TGGATTCCCA AGCGCTCTGA ATTCCCAGAG
61201 GGACTGTCAG GTTATATAAG GGTTTATGAA CAGAAAATTT TCTAGCTTTA
61251 TCTCCACTTT CTTGAAACGC AGTTTGATTA GAACGAACGG CAATTGCTTG
61301 GATAAAAGGA GTGAAGTGGA GAGGTCGTGA TCGCCATTGT AGAGATAGAG
61351 AGCAAGAGAG AGCCGCCCCT AATGTCGTAC TATAACATTT GCCTTCCGTT
61401 TGTATTTTTC CAGAATATCC AGATGCTTTG ATATGGTGGT TGCTGTAGCT
61451 GTAGGCTAGA GATGCAGATG TAGAGAATCT CTCTTGCAGC CAAGGATTAT
61501 TGATCTGGAG CGCTACAGTT GTCGTATGCG AAGCCACGGA ATTGTCGGAG
61551 TGGCTCTCGT AGAGATTACT GAAAAGTTGG GAGAAGTTTA CACCAAAGCT
61601 ATGATTAGAA GCAGTGTTTG AGGTTGTTCC CAAAGAATAA CCCGTAGCTT
61651 CCATATGGAA TCCTTTCGCA TCATTGTTGC TATTTTGATG CACGAAGAGT
61701 CGAGTAGCTT CTCCAGAAGC TGTAGGTGCT ATTTGGCCTT GCTGTGTTTG
61751 ATAACGTAGT GTCGCAAATA AGTTATGGAA AGATTGCCAG AAGGCAGATA
61801 GGGCAATGTC TCCTTTGTTT TCTGGGTTTA CCTTATATCC TGTAGGTGTC
61851 CAATCACCAT AAAGCTGGCG ATGTAAAGTA TTCACAGTAT CTTCAGAAGA
61901 GGTATCAGAA GXTGTGATTG TTTCGATCCA GTAAGGGGAC CAAACGCCTT
61951 GGTAGCCGTA GTGTTGAGTT GTATTTAGAC CCTCAGGGTA GAAATTATCC
62001 GTATTAATAT GTTTAGCTGT GACGTCTAAG AGATACAGAA GAGGAACTTC
62051 TGCGATAGGT TGGGCAAGGT CTGCAGTATC ATAGGGATCT AGGTTCTCGT

62101 CATCCAGTAG GCTCAAAGGT CCTGAGAGAT TGATTATAGG GTTATTATCT 

62151 TCGCTATAGG GTGCTGATGA ACCTGTGGGG CGAATCCATA GCTTGGGAGC 

62201 AACTCTGTTG CCTAAGATAG AGGGAAGGTT AATTGCAAGA TTATTGATGT 

62251 TAATTACAGA ACCCACACTA CTGCTACTTT GTTCTTCGTC TGTTGTAGAA 

62301 AACACAGCTC TACTGCCTAA CCGTAGAGTC CCACCAAATT GATCAAATTT 

623 51 ATAGACTTTC CACTCTGCTC GATCTTCAAG AGCGAGTGTG CCGTTGTACA 

62401 GTCCAATGTG GTTTCTGAAA TGTGAAATGA AGTCATCACG AGAAGTCGAT 

62451 GTATCCGGAA TATATGTTGA GGAGAACAAG ATAGTTCCGA GGTGTTCTGG 

62501 ATTAGGATTA AATTTTTGGA TAGAGTTTTG TATAGTATAT CTTTGTAGTA 

62551 TGGGATCATA GAAGGTAGCA GAATGACCTT GACTTGCTCC AACTGTTAAT 

62601 GAGACATTAC GCGTGCAGTT TACAGAAACA TGATTGCTGA AAGTATCTTT 

62651 GAAGTGTCTA TTATTATAAA AAATAATATC TCCCTGATCA GCAAATAAAG 

62701 TGCATGCACC ATCTTGACGG AGCATGATAG CGCCGCCCCA AGTTCCCTGA 

627 51 TTGTTTGTGA AATAGACGGG ACCACTGTCT TGTATAGTTA GAGATTGTGT 

62801 ACAGATAGCA CCTCCATCTC GTGCTGCAGT ATTATTATCG AAGGCTGCAA 

62851 TTCCTGGGTT GTCTTTTATA GAACAACTAA TGCAATAGAT AGCCCCTCCA 

62901 GAGGAATGGT TAGCAGAGAT GTCCGCTTCC ATGGCAAAAT TATTGTTGAA 

62951 GATCACAGAA CCGGTATTCT TTGTAAGAAT GCACTCTTGA TGTACTCTTA 

63001 TTGCACCACC CAGACCTGAT TGGTTATTCA AAAAATAGAT AGGCTGAGAA 

63051 TTATTCTCAA TTCTACAAGC ATTAGCGAAC AACGCGCCCC CCGCTGTTCC 

63101 GCCTGCAGCA TTATTAAAAA ACAGGCAAGG GCCAGTGTTG TCCTTAATGT 

63151 TTATGATTGC AGCTTGGATT GCTCCTCCTG AAGATTTTGC CTTGTTGTTA

63201 ATGAAGTATG CGGTTCCTTG ATTTTTTGAG ATTGTAACAT TTTTCGAACA 

63251 TAAAACAGCT CCCCCTGTAC AAGTATCAGC GAAATTACTT GCATTAGGAA 

63301 AGCTTAAATT CCCAGAGAAA ATGATGGAAC CATGATTCTC AGAAAGATCG

63351 AAATTACCAT TCACATACAT CGCACCAGCT CTTTTAATAG CAAAGCTATT
63401 TAGGAAAAGA ATTTGGTTTT TTGTATTCGT TATGGCAAGT GATTTGCAAG
63451 AGAGAGCACC GCCGTCTTGA GAGAAGTTTT CGAACCAGCT TTCTATGGAA
63501 TTCTGGTGAT CGAGGACAAT GTCTTGGTTA GTGTCATCCC TAACTCCAAA
63551 AAGTGTTGCT CTATGAGAGT AGGGAGTCAT GTTAGTAAGA GTATCAATTA
63601 GAGGGAAGAG TGTTGTGAGT TGATTTGCTT GATTATCAAA ATAGTCAGAC
63651 AACGGAGTCG CATTAAGGAG TATTGTAGTT TTACCTAAAA TTAAGGCTCC
63701 AACAAAGAAG GAAGATTTAC TAAGGGATCT GTTATTTTGC ACTGTTTATT
63751 CCTTGATGTT TTCTTTGTGG TTAAAATGTG CAATGACTCT CAGCTTTAAG
63801 GTATTGACCT ACAGTTGACG AGGAGACTTG AGCTGAATAA TCTAAGGATA
63851 AGGTTACTCT TGAGAAAAGT TGGGAAGTAT TTTTTATTTT TGCAGCTACG
63901 GAATTATAGG AGACGGGAGT TGCTTGTGTT GTCCATGTTC CATTGCTGAT
63951 GAGTAGTGTC GTGAACATTT CTGGATTTTT TCTGTATAGG GTAGGTACGT
64001 AGGATATTTC CGTAGTCCAT AGCATGGGGA TATGATGTGA AGTTTTCCAT
64051 TCAGAACGGA AGCCTATGGG AGAGGAAAGA TCTGTAAGGG GATGTTTTGG
64101 ATGGAATTTT CTTATATGGT CTCCAGTTTC TTGGAACGAG GCCTGGGAAC
64151 AGCGCAGAGC AATGGCACTG ATAAAGGGCT GGAGTTCGAG AGTGCGGGTG
64201 ATTCTAGCTG GTAAGAATGT GCAGTCTAGA GAGGCTACCA AAGTGTGGTT
64251 ATTAAAGAAG GCTTTGGACG ACCCTTTTAA GATTTCTGTA TAGTGGCAAA
64301 GCATATGGTG ATCTCCGTAG CTATAACCTA GGGATAGCCC TGTAGAGATG
64351 AAGTCCCTGA AGAGGAGACT GTCGAAGCGG AGTCCTGCAA AGTAGTTGTG
64401 GGAGGAAGTC GTACTTGGAG ATTGACGTTC TCTAGTTTTG GAGAACATTT
64451 GTGCGAATCC TAAAGAGAAA CTATGTCGTG CTGCAGTTTT TGCTGAGGTT
64501 GTTGCTGCAT AGCCCGTAGT ATGGTTTCGG AAGCCTTTGC GTCCCTCGCG
64551 ATTATGTTGG TTAATTAGAA GCCCGAGTCC TTGCAGAGAG GCTTCAAGGT
64601 CATGCTCTTT GAGGTTTTGT GGAGGTAAGA TGCGGATTCC TAACAGAGCG
64651 TTATAGGCAG ACTGCCATAA GGTATTAGCA ATAAATTCTC CGTGACGTTC
64701 CGGGTTAGGG CGGTATCCTA CAGGAGTCCA GTCTACGTAG AGCTGCCTGT
64751 GGTTTGTATT GGTCTGTTCC GGTACTGTAG AGCTTGTTGT AGTCGTAGTT

64801 TCCATCCAAT AGGGAGACCA GATTCCCTGA TATCCATAGT GCTCATCTAA
64851 GTTCATGGCT TCTACAATGA GATTCGAAGT ATCGATTTTT TTTGCAGTCA
64901 CATCGAGGAG GTAGAGGAGG GGGGATATCC TTTCGAGGTT CAGAGAGATC
64951 TAAGCTATCA TAGGGGTTTT CATTTTCATC GTTTAGAAAA GTCAAGGGTC
65001 CTGAGAGAGT GATAGTAGAA GAAGTGTCTT CAGAATAGGT GGATCCTGTT
65051 AATGTAGGAT AAATCCAGAA CTTTGGAGCT GAGGCTTCTG ATTGTAAAAT
65101 AGAAGGAAGA TTGATCGCGA TTGCATTAAA ATTTATGGAG CTTCCCGGGC
65151 CTTTCGTCCT GATTAATGCT GCGTTTCCTA AACGTAGAAT GCCCCCAGTT
65201 TGCGATAGGG TTTTGCAAGA AATAGCAGCC CGATCTTCAA TAGCGAGCAC
65251 ACCCCTTTCA AGTCGTGAAG AGTTAGAAAA TTTTGATAGG AAGTTCAATG
65301 GATTTGTTGC GTTAGAATCT ACATTGATTC CGGAAAACAA CACGGTGCCA
65351 AGGTGATGGG GTTCATAATT AAATACTATA GGATCTGTTG TCGTCTGATC
65401 GTGATCTATA GGATCATAAA AGAGAATTTT ATAACCCTGT CTTGCTCCTA
65451 GTTTTAAGTT AATCCCCGGA GCAGCATAGA GTGCATTTCT ATATCCGGGT
65501 TGAGGAGAAG AAGATGTGAT TGTATTATTG TTAAATAGAA TATCGCCGTA
65551 GTCTGCAGAG AGGAAGAAAT TTTGAGGAGT ACTTCCTATA CCAGAAAGAT
65601 TGATGAGAGC CCCTCCTGAA GTCGCAGAGT TATTAATAAA TGCTGTCGGA
65651 CCGTTATTTT GGAAGATGAA AGATCTCGTG TGTATAGCTC CGCCGCTAAG
65701 TGCTGCCGTT TTATTGTTAA AGATAAGACC TTTGGGATTG TTCTCAATGA
65751 CTAAGGAGGT ACACATGATA CCGCCACCAC CGGGATATAG TTTTCCTGAT
65801 GCTGTGTTAA TTGAGGATGC GGAGTGATTG CTGATCTCTA TAATTTCTTT
65851 ATTTGAAGAG ATAATGACTC CTAGGGCAGA AAATATACCA CCCCTTGTCC
65901 TGAAGCATTA TCATTGATTT GGATGTTTTG ATAATTCCTT TCTATATTTA
65951 CGGCTCTGCA GAAGATGCCT CCTCCAAAGC TGGAATCTTC TAATGTTTGA
66001 TTTTTCTTGA TAACGATAGG ACCTAAGTTG TCTGTGATTG ATAAACTCAC

66051 CCCAACATAG ATAGCTCCTC CTCTATTTTT AGAGACATTA TCTAAGAATT 
66101 GTCCTTCTCC TAAGTTATTT GTAATATAGC TATCTAGTGC ATGTATAGCC
66151 CCACCAAATC CTGAAGTTGT AGTAGTAGCT AAGGAAGCCG CATTTGAAAT
66201 AAACGAGTAG TTCTGATTCT TAGAAATCCA GCATTCCCGA ACACTGTAGA
66251 GAGCCCCTCC AGAACTGTGG GAGCTATTCC TTTCAAAACT TAAGTTTCCT
66301 TTATTTTCAG AAATGAACAA ATTTTTACAT GCAAGAATGC CTCCATCCTC
66351 GTAGTGGTAG TTATAATCTA TAACAGAATT GATAGAGTTT CCTGTAATTG
66401 TGATGTCCTG GGTTATATCG TGACCGAATC CATTAAGAAG ATTAGAGGGA
66451 CGAAAGGAGG AAACCAAGGG CTCTGGAGTT ACATTCAGAC GGAAGCTTGG
66501 AGACGTGTGG AAAGCAGAGA TGTCTTTTCT AGACATCTGA CPJ^GhGGCGA
66551 GGTTAGGGAC TTCATTTCCT GATAAGGAAC AACATAGCGC AGTGGATAAA
66601 ATGCTGAGAC AAATGGGTCG CATAAACATC CAAACTTTGA TATGAAAATA
66651 GAGTTTGTCA TAAAAAATCA TCAATTTGTT GTCTAGAAGT TTATAAATTT
66701 ATATTTAAAT ATGCTTATTA ACATGTTGAA TATTACAGAG TTAGTAAACT
66751 CATATATTTG AATGATTTTC TATATTTAAA TTTTGAGGGA GTCCTCTACC
66801 GATTGGGCTA ACGTAGCTCA TCGGATAGTT GTTAATTCTA AAGATTTCAA
66851 TTTTATCGTT CTTTTATTTT AGAATTTTAA GGTACTTCCT GCTTGGAGAT
66901 GGTGCGTAGA TGTCGAGGAG GAGACCGATC CTTGGTAATC CAAGAATAGA
66951 TCGAGAGAAC GGAAGAGCGC AGTTTGATTG TGGACTTTGT ACCCTAAAGC
67001 ATTGCGAACA TAGTTATGGC CTAGGATATC CCAGGAACCT CCGCTCGCAA
67051 GTAGCGTGAC ACCGATTTGG GGATTTTGTT GATAGAGTAC CGGTTGGTAA
67101 GAAAGTTCTA GAGTCCATTC TGTAGGTACG TGGAATTTTG ACTGCCATTT
67151 TCCTTGGATT CCTAGAGGTA AGGTCAGATT ATAGAAAGGC TTTTGAGAGA
67201 CAAACTTTCG GGGATTGTCA CCAATCTCTT CGAACGCTGT TTGGTGAGAA
67251 CGTATTGCAA TTGCCTGAAC GAACGGGCTG AGGTGAAGAT AGGATTTCTG
67301 TTGCCAAGGG AAAGAACAGC CGATAGCTGC TGCTAATGTA TGGCTATAAC
67351 ACGTCCCTTC TGCCTGTTCT TGATGTGAGG GATGTAGGCT GTGGAGGTGA
67401 TGGTCCCCAT AGCCATACGC TAACACTGTG GATGTTGCAA AGGCCTCTTG
67451 GAACCACGGA AGCTCAACAT AAAGTGAAGA GACTGTATTG TGAGCCGAGA
67501 CGTTGTTGCT TGATCCGATT TCTTTAGTGC GGGTGAAGAA CTGTGCAAAA
67551 CCTAAGGAGA TTTTCTGATG TAAAGAAGTT TCGGAGGATG CTTGTAAGGA
67601 ATACCCTGTA GATTGGATAC GGAATCCTGG AGCCCCGGGG ATGCTATTTT
67651 GATGAACAAA GAGGCCGTCG GCAATCCCTT GAATTTCTAA GAAAGGCCTC
67701 TCGATATCAG AATCACCAGT TCGATTATAA CTTCTTAATA GAGAGAACAT
67751 AGTATGAAAG GATTGCCATA GGGGAGTCGT AGCAAGATCT CCTTGGTATT
67801 CAGGATTGAC CTTATATCCT AAGGGAGTCC AATTGGCATA CAGAGCTCTG
67851 TAGAGGGTGT TTGCCGTCTC TATAGAAGCG TTATTTGTTG TTGTTATCGT
67901 CTCTACCCAJ\ TAAGGAGACC AGATGCCTTG ATAACCGTAA TGCTCAGTCG
67951 CATTTAAGCT TTCAGGATGA AAGTTATCGG TATTGATATG ACGTGCTGTT
68001 ACATCCGATA AAGAAAGAAG ATGAATGTTT TGTAAAGGCT CAGAGAGATC
68051 TATACTGTCG TAGGGATCGC GGTTTTCCTC ATTTAAGAGT GTCAGAGGAC
68101 CTGATAAAGT AATTGTAGGG TTATTGTCCT CTGTGAAAGG AGCACTAGAT
68151 TGTAGAGGAC GGATCCACAA GGTAGGAGCT TTTCCTTTTG CTAAGATCGA
68201 GGGGAGGTTA ATCGCAAGGT TATTAJ^TGAT GACCTGGGAG CCTACACTAG
68251 TTGATGGAGT CTCAGAGTTG GCAGTTGTTG CAATACTCGC CGCATGCCCT
68301 AATTTAAGGA TACCTCCTTT TTGAGTGAAC TTATAGAATT GCCATCCCGC
68351 ACGATCCTCG ATAGAGAGGA CACCATTGCG AAGTTCAGAG GTATTTTTCG
68401 AGCTGCTAAT GAAATTATTT TCGTAGTCAG AAGCTTCTGG GATATAGGCT
68451 GAAGAAAATA AGATCGTTCC CTGATGGTTC GCATTGGGAT TAAAGATTAG
68501 AGGATTTGTA GTTGGATGTT GGTGTTCTAT AGGATCAAAA AAAGCAGTCG
68551 TATACCCCTT ATTAGCTCCA AGTTGTAAGT TGCTATTTGG TGTACAATGT
68601 ATGGCGTTGT ATCTACCAAA TGTGGTGAGG AAAACCTCAT TATTTTGAAA
68651 TGCGATATTT CCTTGTTCCG CGAAGAGTAG GCAGGTGCTG TCCTGTAGGA
68701 GCATAAGAGC ACCTCCCCAG TTTCCTTGAT TGTTGGTGAA ATATACGTGG
68751 CCACTATTTT TGATTGTCAA AAATTGTGTA CAGATAGCTC CGCCATCGCG


68801 AATGCAGTAG TTATTATTGA AAAGAATAGT TCCAGGGTTA TCGTCTATGG 

68851 ATAGGTTTGT TGTATAAATC GCCCCTCCTG AACCATTTCC TGAATTTATC 

68901 GAACCAGATA ACGCTGTGTT GTTATTGAAA ATCACCGACC CGGAGTTATT 

68951 TTTTATCGCA ACAGTAACGC TTGTTTGAAT GGCCCCGCCA TTGTTCCCAC 

69001 AGTTGTTCTT AAAATAAATA GGACGCGTGT TATCAGAGAT CGTTGTATTT 

69051 TCACTACGAA GCGCACCCCC TCCACTAGGG GCTGTATTGT TAAAAAAGAG 

69101 TAGAGGTGCC CTGTTGCTTT GGATGCGGCA GTGTCCATTG GTGGAGAGGG 

69151 CTCCTCCCCA GTTGTTGACG GAATTGTTGA CAAAGTAGAA AGTCCCTTGA 

69201 TTTTGAGAAA TCGTGAAGTC TCCATTACAG GCAATCGCAC CCCCACGAGT 

69251 TTCTCCTCCT GTACTCGCAT TGTTAAGACC TCGATTGCTG AAAAAAATAA 

69301 GGGGTCCTCT ATTCTTCGTG ATTGTGCAGG CTCCCTGGCA AGCAATCGCG 

69351 CCTCCAGTCC CAATCGCGAG ATTTTTACTG AAGAAGGCAT GGTCTTCAAC 

69401 ATTTGATAAT AAGAAATTAT TACAGGACAC AGCTCCCCCA GCCGATGTCC 

69451 AAAGAAGAAG GATGTTATCA ATAGACTTGT AGTTAGAAAG TACAATGTCT 

69501 TGAGAGGAAT TATGTCTATT TCCAACAAAC GTAGTTATTG GAGAAAATCC 

69551 TGTAAGAGTG GAGAGAGAGT CTAAGAGAGG AAAGCTCGTA CGAAACTCTT

69601 CATCCCTCTC TAAAGCAAAC TTTTCAAGGG AGTCCGTTTG TAAACTATAC

69651 ACTGCAGGAG TCATCCCGAA CATGCAGGCT GTGAAATTCC CGAGATAGAA

69701 TAAAAACTTA GGAGGAGTCT TTGACACTAG AGGTTCCTTA ATCTTTCTGT

69751 TTTGGTCACT TATTGTTAAA ATCTCATTCT ACTCGCCACG TTTAAGTAGT

69801 GACTCAGCGT GGAGGAAGAA ATATCCGCAG AGTAATCTAA GGAGAGAGTG 

69851 ACTTTAGGAA ACACCTGCAT GGTATTTTTC ACTTTGATCC CTAAAGCATT 

69901 GTAGGTCACA GGAGTGGCCT GCGTCGTCCA CGTACCTTGG CTAATCAGTA 

69951 ATTTCGAGTG GAGTTCAGGA TCTTGCCTAT AGAGAGTAGA GCGATAGGAA 

70001 ATTTCTGTGA GCCAGACTAG GGGAACTCGG TGGTGGTTCT TCCAAGAAGC

70051 GCGGATTCCT ACAGGGAGGG AGACGTCCGT TAGGGGGCGG TGTAGGGAAA 

70101 ATTCCCGAGC ATGGTCTCCA GATTCTTGAA ACGCAGCAAG ATTTCCTCGG 
72851 CTTTAGGGAG CTTGTGATGT TTGGGATCGG AGAGCCTCCT TATATCCCTA

72901 GGGGATGCCC TAGGAACTCT TCCGAAACAC CGAGGGTCTG TTAGAGATAA 

72951 AAACAAAGGA CCATCGGGGA GACTTGTACT CATAAGAGCC ACTTATCTCT 

73001 ATTCCTATAG ATTTTTGCTG ATTTTGATTA TTAGAAAATA ATAGATTTTT

73051 TCTGTTTTTT AAAGTAAAGT ATTTTTTAAA AGACTCATTT TTAATGAGTT

73101 ATTACTTTTC TCTTTGGTAT CTGAAGGTGC AACAGCACTT TCAAGCAGCA

73151 TTTGATTTTA CTCGCTCCCT GTGTTCACGA ATTTCTAATT TTGCTTTGGG 

73201 AGTGATTGCA TTGCTTCCTA TTATTGGGCA GTTGTATGTA GGGCTGGACT 

73251 GGCTCCTCTC TAGGATAAAA AAGCCAGAAT TTCCTTCCGA TGTGGATCAG

73301 ATCGTGCGAG TAGAACACGT CGTGGGTCAC GACCATAGAA GTCGAGTTGA

73351 AGATATTCTA AAGAGACAAA GGCTCTCATT AGAGCCTAGA GACGAGGGGA 

73401 AGGTTCACGG AGATCTGCCT TCAGCTCCTT TTTTTTGATA TCCAAAGTCT

73451 CAAGTTCCTA CAGTTGTTCT CTGAGGGGAC AGCTCTAAAT TTATTTCGTA

73501 TATTTGCTCC ACTACGCAAC CGTGTGACTA CAGAATACAG TCGTGCTAGG

73551 CAACCCGACC TACATAGAAT TGCCATCGTC TATATAGGAG TTCTCGATTC

73601 AGAAAGTTCC AAGATCCTAG AGCGGCTAAT CTCTTATATG AGTTGTATCT

73651 ATTCTGAATC GCAAATGTAT TTAAGATTCT TTATGGGCAA GAATGTAAAT

73701 CAAAGTGCTG TACTCTCAAA ATTACATGTA GAAAATCTGC ACATCCGTTG

73751 TGGGTTTTTC AGCGAGGATG CTGTTCCAGA GAGTGAGCCC TTCGATCTCT

73801 CCATCTACGT GCACACAGAT CGTAGCTGTC CTCTCCCTAC GAAAAAACGG

73851 AGCAGCTCCT GGGAACTCCA AACTGTAGAA CTCCCAGAGT CAATATATCC 

73901 ACAGTCGGAA TTCCTATTGA TGAGACCTCG AATGCTTTCG TAGACTCTAT 

73951 GATGAAACAA GGAGTCGGGC AGGATGCTAA AGAGCTATAC ACATTTCTAT

74001 CTCGTGGGAA TGAGCATTAC CAACCGTGTC TATGGTTCAG TCTCGAAGAG 

74051 GAACTCGGAT TCCTTTTCGA TGAAAAAATG CTCTGCGCCC CTCTATCTGA 

74101 GGATCACTAT TGCCACTCGT ATCTTGTAGA TCTAGTGGAT CAACATTTAA 

74151 AGGATTTAAT ATTATCGATG TTTTTAGATC CTCAGAATAT CTCAGCAGGA
74201 GAACTCCTCA AGGTCTCTAT AAACGTTGGA GATTCTTTTT CTCCTCTACA
74251 ACAGAAAGAT TTCCTCTCGA TGGTCTTACG TGATGAAACG GGAAAAAACG
74301 TCGTCGTGGT TTTTAAAGGA GTTCTCTCCT TACCCGCAAC CCAAGTCTGC
74351 AAATTAGTAG AGGAATTGAA CTCTAAGGAC TACTCCTACC TCAATATATT
74401 TTCTTGTCAC GGAGATAGTA GTCCTCAGCT TTTATTCCGT AAGGAATTAG
74451 AGGGAACTTC AGGGCGTTAT TTTACAGTGA TTTGCGCTTT ATATCTAGGG
74501 GATACAGACA TGCGTAGTTT ACAACTTGCT TCTGAAAGGA TCATGGTCTC
74551 TAGAGAGTTT GATCTTGTAG ATGCCTATGC TGCAAGATGC AAGCTCTTGA
74601 AAATCGATCA TACAAATTGG AGACCTGGAA CTTTCAGTCG CCACGCCGAT
74651 TTCGCAGATG CTGTAGACGT ATCAGCAGGA TTTAACTCAA GAGAATTTAA
74701 ACTGATTACG CAGGCGAATC AAGGGATCCT AGAGTCTGGA GAACTCCCGC
74751 TCCCTTCAAA AACCTTCTGG GAAGGATTCT TAGCATTCTG TGATCGAGTG
74801 ACTGTCACGA GACACTTCAT TCCAATGTTA GACGCCGCTA TAAAGCAAGC
74851 GGTATGGACT CATAAACATC CCAGCTTGAT AGATAAAGAG TGTGAAGCCC
74901 TAGACTTGAA AACACAGTGC TTGCCATCTA TCGTATCGTA CCTTGAATAT
74951 GTCACAAACT CTCACGAAAA AACATCGAAA GGCCCGTTCA TACAAAAAGA
75001 GATTATCGCA GACTGTTCTC CTCTTAAAGA GGCGCTCTTC CCAGGTTCTG
75051 ATGAAGATGT TCCCTCTACC TCTGAGGATC CTTCAGATGA TCATCCTTCG
75101 GATCTTGAAG ACTCTTAATT AGTTGCGATA GAATTCAATT TTTTATATAA
75151 AAACTATCGT GTTGTTCTTA TTAAAAGATA GTTAATTTTC TATCTTTTTT
75201 TAAATCTTTA TATAGCCTGC GTACGCTTTC ATTTTCAATG TTGGTTTGAT
75251 CCTATGGCAT GCTATATTTC TATTTGGATA TCTACAGTTA AGCAGCATTT
75301 TATTAGGGCT TTTGATTTTA CACGTCCTCT TGGTTCTCGG ATTACAAATT
75351 TTGCTTTGGG GGTCATCAAG GCTATTCCCA TTTTAGGATG CGTTGTTATA
75401 GGGGTAAGTT GGCTAGTTTC CACATGTTCT GCACGAAGGT TTGGGAAACC
75451 GGCATTTACT TCTGACGTTG CTAGTATCGT GAAAATAGAA AAAACTCGAG
75501 GTTATAATCC CCTTGCTTGG GTGGAACAGT ACTTGAGACA GCTTAGGGTT
75551 CGACTTCCTG AAGGAGATTT AGGAAAAATC CATGGGAAGG TCTCCAGAGA

75601 TTATGTTTGC GACAGGACTC CCCAAGAAAA TCTGAATATG GTTCCTCATC

75651 AATATCTGGG AGAGCTAGGT CGCGCGTTTT ATGGAATCCG CAACCGAGTA

75701 ACCAAGGCGT ATCAACGAGT CACTCCTCTG GAAGTCCCTT GTCTTACGCT

75751 CGTCGGTTTT GACATTTTAG ATCCCGAAGA TCAGGTGAAT TTCGTTCGTC

75801 TGGCTAACGG CATACAAACT CAGTACCCCC AAACTCAAAT AAAACTTTAT

75851 TTAATCTCTA TCCAAAAGAT ATGGAATCAG TGTGACGGTA CGATTTCTCA

75901 AGAAAAAGAA CAGCAACTCC GCTCTCTAGG TTTGGATGCT AAAATCAAAT

75951 GTGTGTCGGC CCCCGCTCTC CTGCTCCAGA AATATCTTCA ATCCGAGAAC

76001 TTGCCTTCCT GTGATCTTCT CATTAATTAT TACGGGAAAC AACAGTCCGT

76051 CAGAGACGTG GACTCTATAA AGAGTCTACT CAATCTTTCT TCCGAACATA

76101 TCCCTGCGAT TTCTGTAACC TATAGACCTG ACGATCCTTT TTATAGCTAC

76151 TATTTCTTTC CTGGTTCTCA AGGAGGAACG GCACCCGATC AGAGGATCCC

76201 TTGGAGTGAG CAGGAGCATC TTCAAACGTA TACCACCCTG TCTAACCCTA

76251 GATGTGATAG ATATGCTGTT CACTTGGGAA TGGAAGATTT TGCCTCTGGA

76301 GTATTTTTAG ATCCTCTTAG GGTTTCGGCT CCTTTATCTG GAGAGTATTC

76351 CTGCCCCTCA TACCTCTTAG ATTTAAAAAG TGAAGAGCTT CGTTGTTTCT

76401 TGTTATCCGC TTTTATAGAT CCCAACAATT CTGGTCAGGG AAATCCGCGT

76451 CCTATGTCCA TAAACTTTGG AAACTCTCCT TTGGGTCAGA GGTGGTCTGA

76501 GTTTCTATCT CGTGTTCTAC ATGATGAAAC AGAAAAGCAT GTGGCTGTAG

76551 TCTGCAATAA TCCACAACTT ATAAAAAAGA GTTTTCCCTC ACATTCTTTA

76601 TCTCTATTAG AGAACGAACT GGAAGAGTCA GGTTATTCTT ATTTGAATAT

76651 CGTTTCAGTG AGTCAGGAAC GCACGTGTGT TAAGGAACGT AGAATTTTAA

76701 GTTCTGATCC TTCGGGGAGG TCATTCACTG TAATCCTCAC TGATCTTCCT

76751 GAAGGGAGTT CGGATATCCG CAACTTGCAG CTAGCGTCAG ATAGGATCTT

76801 AGTTTCTAGT GCTCTCGATG CTGCTGATGC CTGTGCTTCT GAATGTAAGA

76851 TCTTAGAATA TGAGGATCCC GAGCAAGAGT GGGCGCAACA GTATGCGTCG
76901 TTCTATAGAA ACATCGACAG GGCAGGCGAT CTTCAACGTC AGGGGATTCC
76951 AGGAGAGCCT TTAGGGGTCT CAGCATCTAC GAGAGTAGTT TTAGAAAAGG
77001 ACATCGTATT CAATCTCAAT GCGGTAATCC AACAGGCCAT GTGGAAGTTT
77051 AAAAAACGGG ATCTTTTTGC TGTAGAAAGT CAGGCTTTAG GAGATGACAT
77101 GCGACGTGCT TTAGAAGGTT ATATCGGCAG CAGTCTCTTA GTTGAGGGGA
77151 CTATACAGCC TCAAGTCGCA TGTAATGTCA ATGTGAGTTT TGCTACGTTA
77201 GACGAGGCTG TGTGTGCAGC TTGTGACTCA GCTCAAGATG CACCTTCTGA
77251 GGAGAACAAT ACAGATGACT AAAGATCGCA ATCTTGTGAA CGAAATCGCA
77301 GATTGATGGG AACTAATTAG ACACACCTTT CTAAGGTGTT TGTTTTGATG
77351 AACCTTTTTA TTAGTCCAGC AGAGCTCTTT TTTGAAGATT CTTCTTTTTT
77401 CTTAGGTCAT TCTGGGTTTT TTGAAGGTAT CGAGGGTTCT TATTGTCTAG
77451 TTGTCTATAG AGGGTATCGA GGTTTTTTCT CTTAGGTATC CCACGATTCT
77501 TTTGTATAGA AAAATTTTAT GAAAGCTTGA ACTCTTTACA CTGACTTTTT
77551 ATTTTTCAAA TAAAAACGTT TTTAAAAATA TTATTATCAT AATTAGATAC
77601 TTATTTGTTT TAATGTCTTA TTTGATTAAA ATAACTTTGT TAAAATTTTT
77651 ATACATAAAT TTCTATTGTG GCTTGTCCAA GTATTTCTTC TTGGTTTACT
77701 GTCGTTCGAC AGCATTTTGT AAACGCCTTT GATTTCACCC ATCCCGTTTG
77751 TTCTCGGATT ACAAATTTTG CTTTGGGGAT CATTAAGGCA ATTCCCGTAT
77801 TAGGACACAT TGTCATGGGA ATCGAGTGGT TGATTTCCTG GATTCCCAGA
77851 CACACCGTTC GTCATGGAAT GTTTACTTCT GATGTCTCTA GTGCTATTAA
77901 AGTAGAACAA ACACGGGGTC ATAATTGTTT AGCTCCCCTA GAAGCCTATT
77951 TAAGTAGCTT GAGAGTCCCC ATTTCCCAAG AAGATCTAGG CAAAGTACAC
78001 GGGAGAACCC CAGAAGATCC CTTCGTAGAT ATCACACCCA CAGAAATTGT
78051 CCAACTTCTC CCTGATGAAG AACTCTCTAC TGTAGATGAG GCACTGCAAG
78101 GCGTTCGTAG TAGGTTAACC TATGCCTATA GGTCCGTAGA GAAACCTATG
78151 ATTCAAGATC TTGCTCTTGT GGGTTTTGGT CTCCGAGATT CTGCGGACCT
78201 CATAAATTTC GTGCGTCTTG CTAATGGCGT GCAGAATCAC TATCCCCATA
78251 CTAAAGTGAA GCTCTATTTA GCGAAGAACT TGGCAGATGT CTGGGACTGT
78301 GAAATTTCTG AAGAGGAAAA AGGGCAACTC CGAGCTCTAG GTTTAGACCC
78351 TAAAATAGAG AGTATATCCC TTACGAGTGC AGGTCTTCCT TCAGTGCCAG
78401 AAGTCGCTAC TGTCGATTTT ATGATTACCT GTTACGGGAA AGATCAGGAA
78451 GTCCAAGATC CCTAGGTGAT ACAACATCTT CTAAACTTTG CTCTAGAAGA
78501 GACCCCTTCC ATTTCCGTGC AATACCAAGA ACAAGAGAAG CTCTCTCCGT
78551 GCGATCATTC CCCAGAAATA GGTAAAAAGA AAAGATGGAA TAAGCTGGAA
78601 TCCTTCTCCA CGTATTGTTC TCTGTTTATG TCTGTTAAGG ATCATTATAA
78651 GCTGAATCTA GGAATTCAGA ATTCCCTGTC AGGGTGGCTT CTGGATCCCT
78701 ATAGGGTTTG CGCGCCTTTA TCTTCACCGT ACTCGTGTCC TTCCTATCTT
78751 TTAGATTTGC AAAACAAAGA GCTACGTCGT TCCCTTCTGT CAACGTTTCT
78801 AGACCCTAAA AATCTCACTA GCGAAACATT CCGTTCTGTC TCTATAAACT
78851 TTGGCAACTC TTCGTTTGGA CAGAGATGGT CAGAGTTTCT ATCTCGTGTT
78901 CTGCACGACG AGAAAGAAAA GCACGTAGCT GTTGTTTGTA ATGATGCAAA
78951 ACTTCTGGAA GAAGGATTGT CCCCAGAGGC ATTGTCTCTA TTAGAAGAAG
79001 ACTTAAGAGA ATCAGGGTAT TCGTATCTAA ACATTCTCTC GGTGAGCCCC
79051 GAAGGAGTCT CCAAGGTTCA GGAACGTCAG ATTCTAAGGC GAGATCTCCA
79101 AGGACGGTCC TTTACTGTCA TGATTACAGA TCTTCCTTTA GGTAGCGAAG
79151 ATATCCGTAG TTTACAATTA GCCTCGGATA GGATTTTAGT CTCCAGTTCT
79201 CTTGATGCCG CGGATGCATG TGCTTCGGGA TGTAAAGTCT TAGTCTACGA
79251 AAATCCAAAT GCATCCTGGG CTCAGGAATT GGAGAACTTC TACAAACAAG
79301 TTGAGAGAAG AAGGTAGTGT TTCTTTCAGA GAATATTTCA GAGCCTATAT
79351 GTGTGATAAA ATCGTGGCAC AGAAGAACTT CTTATTTACT TTAGACGCTG
79401 TAATTAAACA GGCCGGTTGG AGATCACAAG AGAAACTCAA TTTATTTTAT
79451 GTTGAAAGTC AGGCTTTAGG AAGAGAAATC AAAGTCAGCT TAGAGGAATA
79501 TATTCAGAGT ATGGTCGGGA TTTTGGGXTC TCAGAGAACC AAGAAAAGCT
79551 TTAAGTTTTC TGTCGACTTT ACCCCTTTAG AGCAGGCTCT ACAAGAAAGA
79601 TGCTCTTCTG ATGATGACGA AGATGCAACA GCAACTTCGA CCGCTACAGG

79651 GGCAACAGCA TCTCCGACTG ACATGCACGA AGATGAGTAA CGTTTGTCTG 

79701 ATACCTTAAA AGTTCCTTGC AAAGGGCTCC CTGAAAACTA AATTCCCTCA

79751 GAATCTCGAA TTCTCCTGAC TCTGAAACAA TCTTAGGTTT TCCTGAATAG

79801 AATCTGACTG AAATTTCTGC TCGAATCTAA GGGCTGTTTC TTATTTTACC 

79851 CCTAGATGAG GATATTAAAT CCAAGCTAGG ACTTCAAAAG TAGTTGGTTA 

79901 TTAGTTTATT AAAGAAAATA ATACTAAAAA TATTTAAAAG CTGTTTATTC

79951 AATTTAATTG ATATTTTCTA TGTTGTTATT TAAAATTGTT TGTTTCTAAT 

80001 TTTATTTTTT TTGTTGTTAT GCCAATTCCC TATATTTCTT CTTGGATTTC 

80051 TACCGTTCGA CAGCATTTTG TTAAGGCGTT TGATTTCTCT CGTCCCTTTT 

80101 GTTCTAGGGT TACGAATTTT GCTTTAGGGG TCATCAAGGC CATCCCTATT 

80151 GTAGGACATA TTGTCATGGG GATGGAGTGG TTAGTTTCTT CCTGTGTTGC 

80201 CGGGATTATT ACTAGGTCCT CCTTTACCTC AGATGTCGTT CAGATTGTAA 

80251 AGACTGAGAA GGCGTTAGGT CGAGATCATA TATCTCGAGT GGCGGAGATA 

80301 TTGCAAAGAG AAAGGGGGAC CATAACTCCT GAGAATCAAG ATAAGGTGCA

80351 TGGGAAGTTT CCTGTCTGTC CTTTTGGTCG TTTAAAATCC GAGGAAACTT

80401 TAAAACTTAA GCCGGGAGAA AGAGAGGGAA CTTTAGATAC TGTATTTTCT
80451 CCGATTCGCA CGCGCGTGAC TCGTGCGTAC TTACAGGCCC CCCGACCCGA 
80501 AATACGTACG ATTTCTATTG TGGGTTCGAA ACTTAAAACT CCTCAAGATT 
80551 TCTCGCAATT TGTGAGTCTC GCGAATGAAA CGCAGAGACT GCATCCTGAA 
80601 GCGTTAGTTT GTCTGTATTT GACAGGCTTG AATCGCGAAT CTCAGATGTG 
80651 CGATACAACT ACTGCAGAGA AGAAGCAGTA CCTACATAAC TCAGGTCTCG 
80701 ACTCTAGAAT CCAGTGCAAA GACAGTAAAG AAGACGACGC TGGCTCTCCT 
80751 GAAAATCCCG AACTTTGGAT TGGCTATTAT TCACGAGAGC AACAGCATAA 
80801 TATAGACGGG CAGTATATTC AGCAGTGTCT AGGGAAGAGT GCAGATCCAA 
80851 TTCCTTGGAT TCATGTTACT GAAGACACAA AGGATTTTTA TTACCCACCA 
80901 AACTTTACTT CATACTCACA TACAAGACAA TCTACAGACC CAACATCGCC 
80951 ACCAAGACTC CCTGAAAGTG AGGGGGATAA GGATTCCTTG TACGGACAAC
81001 TGAGTCGATC GTATCACCAT GAGTATATGC TTGGTTTGGG ATTAAAACCA
81051 GAGGATGCAG GACTCCTGAT GGACCCGGAT AGAATCTATG CTCCTCTATC
81101 CCAAGGGCAT TATTGTCATT CCTACCTTGC GGATATAGAA AATGAGGATC
81151 TACGAACTTT AGTCCTTTCG CCTTTCCTAG ATCCTGGCAA TCTTAGTAGC
81201 GAGGATCTTC GTCCTGTAGC ATTCAATATC GCTAGATTGC CATTAGAATT
81251 GGACTCGTTA TTTTTCCGCC TTGTTGCGGG TCAGCAAGAA GGGAGAAACA
81301 TAGTTACCCT TGCCCACGGA ACTCCTCGTC CAGAAGATCT TGATCCTGAC
81351 TCAATGAACA TTCTGACCAG AAGATTACAA ATGTCTGGAT ATAGCTATTT
81401 GAACATTTTC TCCTATAAAT CACGGAAAAT GATTGTAAAA GAACGTCAGT
81451 TCTTTGGAGA TCGTTCTGAA GGGAAGTCTT TCACATTGAT CTTATTTGAG
81501 GATCCCATTA GTGCAGCAGA TTTCCGTTGT TTGCAGCTAG CTGCAGAAGG
81551 TATGGTTGCT AAGGATCTCC CCAGCGTAGC AGATATTTGT GCCTCTGGAT
81601 GTTCCTGCAT TCAGTTTTCT GAGATGCAGA GTCCTCAGGC TATTGAATAT
81651 AGACAATGGG AGGCACGTGT CGAAGATGAA GCAGGAGAAG AAGCCAGAGA
81701 ACCAGTAATT TATTCTCAGG ATCAATTGAG CAGCATGCTC ACTACACAAC
81751 AGAATTTTGT ATTTTCTCTA GATGCTGTGG TAAAACAGGC GATCTGGAGA
81801 TTCCGTTCGA AAGGTCTTCT TACTATGGAA AGAAAGGCAC TAGGCGAGGA
81851 GTTCTTAACT GCGATATTTT CCTATTTAGG GAGTCAGGAG CGTAATGAGA
81901 ATATGGGGAA AAGAACTACC GAAGAACATG AGGTCGTTAT CAGCTTCGAA
81951 GAGCTAGATC GCATGGTGCA AGTCCTCCCA GCCGAAGTCC CTGCAGATTC
82001 AGGCAATGAT CCTACGCGTC CCGTTCCTAA TCCAGATAGT AACCCTGATT
82051 CCTCGCAAAA TGAAGGCAGT TAGAAAGTAA AAATACTAGA GAAATTTCTT
82101 ATCTCCAGAT GGAATCCGTG GTCCATGTAC CTAGGATTCC AGGAAGGGTT
82151 TGCCTAGGAA TATCTTAATT TCACAAACCC CAGGAGTGCA CAAACTCCTT
82201 AGAGGACTCT CTCTATATTT GTTCTTTCTA CACTAGTATT CCGTAGATTT
82251 CTGATTTCCA GGATAGGATT CTAAATAGTT GTATCTAAGC GCTTTTTACA
82301 ACCTTTCTCA GGTCTCCTCT TTCTAATTTT AAAAATAGCG AATTTCTTGT
82351 GGCTATAGGA GTCGCTAAGT ATCTTAGAGA TTGCTTTTTT CAAGTTTTTT
82401 TTAAATACTT CTTTCCTAAG TATTTCTTCC TAGTCGTGTT ATGGCTTCTT
82451 GTTTATCTGC CTGGTTTTCT ATAGTTCGTG AGCACTTTTA TCGAGCCTTT
82501 GATTTTTCTT TGCCGTTTTG TGCTCGTATT ACGGAATTTG TATTAGGGGT
82551 CATCAAGGGG ATCCCTGTTG TGGGTCACAT TATTGTTGGG ATAGAGTGGC
82601 TCGTTTCTAG GTATTTAGAG AGTTTCGTGA CCAAGCCGAC ATTTGTCTCT
82651 GATGTGGTGA GTCTTCTGAA AACAGAGAAA GTTGCTGGTC GCGATCACAT
82701 TGCTCGTGTA GTGGAGACTT TGAAGAGGCA GAGAGTCGCT GTGGCTCCTG
82751 AAGATGAGGA TAAGGTCCAT GGGAAGATTC CTGTGCATCC TTTCGGGGGA
82801 ATCCAACCTG TAGAAGTTCT CACTCTCTAT CCCGAAGTTC AAGATGCAAC
82851 GTTAGGGCTT GCCTTCTCTA AAATTCGTAA TCGTGTAAGA CAGGCGTATT
82901 TGCAAGCTCC ACGGCCAAAA CTGCAGAAGA TTTACATCAT AGGAAACGAT
82951 ATGAATCCTT TTGAAGTTGA CGACTTCTTG CATCTAGCCC GTCTCTGTAA
83001 TGAAACTCAA AGACTCTATC CTGACGCTAC GATTTCTCTA TATCTAACAG
83051 CTTCTGGTGG TCGCAATGCT ATGGACAAAA AGAATCGGAA GTTACTTAGT
83101 GATTGCGAAC TAAACCCCAA GATTGCTTGT TTGGACTTTA ATCAGGGTGA
83151 TGTAGTCAAA CAAGCAACTT GTGACTGTTG GATGGTGTAT CATGGGGAGA
83201 ATGATCAAGG TACGTTGAAT CAGATTCAGG AAGAGTTAGA AAAGTCAGGG
83251 GAGGAAACCC CTTGGATTCA TGTGGGGCAA AAGCCTCTTT CACAATCCTT
83301 GTGGGATTTC TCTCCATTTT CATCTTTGGA GATGAAGGGA GATAAAGAGA
83351 AAGCTCTAGA GTACTCTGAA TTAGAAAAAG AACAGCTATA TTCTCGATTG
83401 GTATACGTAG GAGAGCGCTC TTCGGTTCTT AGTTTGGGGT TTGGAGATAG
83451 TCGGTCAGGG ATCTTGATGG ACCCAAAACG GGTGCATGCT CCCTTATCTG
83501 AAGGGCATTA TTGTCATTCC TACCTTGCAG ACTTAGAAAA TCCCGGGTTA
83551 CAAAAAACAA TTTTAGCGGC ATTTCTGAAT CCTAAGGAGT TGAGCAGTAC
83601 CATACTGCAA CCTATATCTC TAAATCTTAT CTTAAATAGC AAAACTTACT
83651 TAAGGCAGCA CTTTGGCTTT TTTGAGAGGA TGAGCAGAAG TGATCGCAAT
83701 GTGGTTGTCG TTGTATGTGA TTCTTGGTGG GGTACCGACT GGAAGGAGGA
83751 GCCAAGCTTC CAACACTTTA TTATGGAGCT AGAGTGTCGA GGGTATTCGC
83801 ACTTCAATAT TTTTGCCTTT AGATCTAATA GCATGTGTGT AGAAGAACGT
83851 AGGATCTTAA ATGAAAGTTC TCAAGAGAAA GCCTTTACCA TGATTTTCTG
83901 TGAGGATTCA GTATCTCAAG GAGATATCCG CTGTTTGCAT TTGGCGTCTG
83951 AAGGAATGCT TTGTGGTAAA GAGTGCTATG CTGTCGATGT CTATACGTCA
84001 GGATGCGCGA ACTTTATGAT GGAAGAAGTC TTAACTTTGG AGCGAGAATC
84051 TAATCTGTGG AATAGAAAGC ATGGTCTTTG GAAAAGAGAA GTTAGAAAAC
84101 AGAAACAAGA AGCTGCTTTG GATCAAGACG AGAGCGAGAT TTACGTTTGT
84151 AATCAGCTGA CGGCGCAACA GAACTTCGCT TGTTCTTGAG ATGCTGCAAT
84201 CCGCCAGTCT ATATGGAGAT CCCGTATGCC AGAACTTCTC TCTATTGAGA
84251 GACGGGCGTT AGGGGAACAA CTCTTTACTA CTGTACATCA CTACCTAACA
84301 ACGCAAAAAA AGATCCTCAG GGGAATCTAG AAACGCAGCA ATCCGCGCAA
84351 TTGTCTATAG ATTTCACAGC ATTAGATGAA GCTGTTGAAT CTCTAGGATC
84401 GACTCTTAGC AGAGCTCCTT CAGAAATATC TCCAATTCCA GAGGAGGAAG
84451 CTCACTTAGG AGCCAACAAA TAGAGACAAA GAAAATTCGA CGGTTTGAGG
84501 ATAACGATAC GCAATGTCTA AGCTTTGAAT CAGGATACTC TGCTTTACAG
84551 GCGTGTCTTT GTGCCTATGG TCTCTCTCTC ATAACAGAGT CTCTCAAATC
84601 TAATTGTCAG AACCTATTTC CCCTAAGAAT CGATAACGTA TTTGTTAGGA
84651 AAACGTGCTC AACAAGAGCG TTTTATTTTC AGTGTACTTG ATGTCTAATA
84701 CAAATTGTTT ATTTTGTATT TTTGGCACAT CATTTAAATT CCTTGTACTT
84751 TTGAATCTAA ACGTAAATTT CTTATGACTC ATTGCTTACA TGGTTGGTTT
84801 TCTGTAGTTC GTCATCACTT TGTGCAGGCG TTTAATTTCT CACGTCCTTT
84851 ATATTCTCGA ATTACCCACT TCGCTTTAGG GGTGATTAAG GCCATCCCCA
84901 TTGTAGGGCA TCTTGTTATG GGAGTCGATT GGTTGATCTC TCATTGCTTC
84951 GAGAGGGGAG TCTCACACCC TGGGTTCCCT TCAGATATTG CTCCTATACT
85001 GAAAGTAGAA AAGATCGCGG GCCGAGATCA TATTTCTAGA ATCGAAAATC
85051 AGCTAAAGAG CCTTAGGAAA ACTATCGAGG TTGAAGATCT AGATAAAGTC
85101 CACGGGCAAT ATCAAGAGAA TCCTTATGCA GATATGGCCT CTAGTGAGGT
85151 TCTTAAACTC GATAAGGGAG TTCATGTTAG CGAGCTTGGC AAAGCCTTTT
85201 CTAGAGTTCG CAATCGCATC ACCAGATCCT ATAGTTATGC CCCTACTCCT
85251 CAGTTGGACT CTATAGCTAT TGTTGGTATA GATCTCGTCA GTCCTGAAGA
85301 ACAAGAGAAT TTAGTACGCT TGGCGAATGA GGTCATTCAA CTCTATCCCA
85351 AATCAAAGAC AACTCTATAT CTTCTTATCG ATTTTAATAA GGAGTGGGTA
85401 GGGGATATCT CCTCTGATAA GGAAAAACAG CTCCGTTCTC TAGGTCTACA
85451 TTCTGAAGTT CAGTGTCTTT CCGTCTTGGA ACCTCAGGGT GCCGAGGGCG
85501 AAGATACGAA ACACTTTGAC CTTATGGTCG GCTGTTATGG GAAGGATTCT
85551 TACTTAAGGG AGGGTAAAAT TTTACAGCAG GCCCTAGGGA CTTCGTTAGG
85601 TACTGTTCCC TGGGTGAATG TTATGCACAC ATTGCCATCT AGGTATAGAT
85651 CTCGGCTTTC CTTACCTATA AATACCGAAA AGGATAAGAC AGAGCTTTAT
85701 AAAGAGATTT CTCGTACACA CCATCAGTTG CATACTTTGG GAATGGGACT
85751 TGGAGCCCAG GATTCAGGAT TGCTCTTAGA CCGGCAACGA CTCCATGCTC
85801 CTTTATCTCA AGGGTCTCAC TGCCATTCCT ATCTTGCAGA TCTCACCCAT
85851 GAAGAGCTGA AAATTTTGTT ATTTTCAGCA TTTGTGGATG CTAAGAACAT
85901 AAGTAAGAAA GAGCTTCGTG AGGTATCTCT AAATTTTGCT AACGATACTT
85951 CCGTAGAGTG TGGCTGCGCT TTTTACTTTT AGTGTCCTAT GATGAGAAGG
86001 AGAAAGACGT AGTTGTCGTT TGTAATCATT CTGAACCTAA TATCCTCGGC
86051 CTGCCTCCTG AAGCAGTCTC TCAGCTTATT GAAGAGCTTA GCGATGAAGG
86101 CTATAGCTAT CTGAATGTAG TGCGTTGTGA TCTCTCCGGG GAGACTACGG
86151 TTCAACAACG TCTGCTATTG AATGCCGATG AAGGGAGATC TATGACGGTG
86201 GTGATCTCAG AQCTTCCTGA AGGGCACCCC GATATTCGGA ATTTGCAGTT
86251 GGCATCCGAA AGAATTTTTG TTTCTCGTGA AAAAGAAGCT GCTGATGCCT
86301 ATGCTTCAGG ATGTAAAGTG GTCGCTTTCG ATGATGAGCA TCTCCCTTGG
86351 GTCTCCAGTC ATATTGCCTA CGCGGAGGAG ATGAGAGAGA AACAAGAACA

86401 AACAATGCAA GGGTCTTTAA CTGAAGAGCA GTTAGGAGCA CTCCTCTGCA 

86451 ACACAGTCTC CACAGAGAAA AATCTAGCCT TTGCTCTAGA CGCCGTGATA 

86501 AAACAGTCTG TGTGGAGATT CCGCAATCCG GATCTTTTTG CTTATGAGAG 

86551 AGAAGCTCTA GAGGCTTCAG TAACAGATGC TTTAGTATCT TACGTTTCAA 

86601 ATTTAGACAT GATACCGTAC ACAAGTTCTC AGGGCATAGT CATAGAAGAT 

86651 AGTAGTATCG TCCGTACCTC TCAAGAGCAT ACACTCATTG TGAACTGTGC 

86701 AGCATTCGAT AAGTTAGCGA GCCAAATAGA GTTCTTATGC CCCAGTGACG 

86751 TGTTGCCCAT TTCTGGTAAA GACCCTTTGA TTTCTGATGA TGAGGATGAG 

86801 GAACTGAATC CTAAAGTTTC ATCTGCTGCA GACTCTAAAG ATAAAACCTA 

86851 GGGAGTGAAT TCTACACGAG AATCGAGAGG AGAGCGAGTC TTTCAAGGAT 

86901 TCATAATCCT TGTTAACGTA TGCATAAACA AGTGAAGCCA TTGCAACGTG 

86951 AAGTAATCGC ATTGATGAAA GATGCTTTCC CTAGGGATGC AAAGCAGATC 

87001 GTATTCCCTT CTTTCCAAAA CAGACTATAG ATTCAAAAGA TATTCTTTCT

87051 TTTCAAATAG ACTTGAGAGA GGGGGGGGGG TTCTCATAGA GTGAGAATCT

87101 TGGCCTTCAT TGCTAAGTTC TTCGATGATG GATAGAGGAT TCTAAGACGA 

87151 CCGGGGCTAC AGAAAACTCT AAGCAGAGCT TAGAGTTTTA AAATGTGGAT 

87201 TTTAGTCCTG TAGACACTCG GTGGTTTGTA AATCCATTTT TCCCGTCAAA 

87251 GGTATAGTTT AGAAAGGCCT GAGTGTCCTC GGTGAGATCT ACTACATCAT 

87301 GGATTTGTAC AAACAAACCG TGGCGGCGTA GATTTGCTCC TGAGATCGAA

87351 GTGCTCTCTT GGTTTGAGAC GACAGTCACA ATATTGTGAG GGTTGACACG 

87401 ATAGATATCA GGCTTGTAGG CAAGTTTGAT GGTCAGTGTA GAAGGAGCCT

87451 TCTTAAATGG TGrGAACCAT TGAGAAGAAC ATCCTATCGG TAGGGAAACA

87501 TTGTACCCTT TACCTCTACT AAAGCTACGT TGCAGATCTC CAGTTTCTGT

87551 GAACTTACTT TGCCAACCAC CTAGGAATTC TGCGGAAATA AAACCTGAAA

87601 GATCCCAAGC TTGAGCCAGA GGTCTTGTAA GAAGACACCA GTTTAGGAAA

87651 GGATGTTCTG CAGAAATAAG AACATAGTAA CTATTGTTAT GCCATTGCCC 
87701 TTGAGATTTT GGAGCTTTGT CAGGTCTGAG GTAGGTGGTA TTTAGGTGAT
87751 TTTTGGAATA ACCATAAGCT GCTTTCCAGG AAATTAAGGC CTCGCTCTTT
87801 TGAGTCACGA TAGGGAATTG ACCAAAGAAC GAGAGTAAAT ACATTTGTTC
87851 TGAGCAACGT GAATCGTAGG GGTTGGCGTT AGTTTTTCCA TAAAGCTGCC
87901 CGAAAGAAAG TCCTAACGTA GTGTGGTCCG TGTAGTTCAT AGATAGCGCA
87951 GCTTGGTAGC CTCCATAGCG ACCTGAAAAG CCCTCATGTC CTTGTCTTGG
88001 TGTGTGTTCG ACATAGGCTC CTAAAGCTTT CGCGGTTATG GACAACCCGG
88051 GATGATCTAT CAAAAGAACA TCTTGGAGAA TATCAGAGAA TGCCTGATTT
88101 CCTAAGAAGG AAATCCATAA GCTGTTGCTG ACAATTTCTC CGTAACGCTC
88151 GGGATCTAAG ATATAGGTAG AACGCACGAG AGTGTCTGAA TTCCATACAG
88201 CATAGAGAGT ATTTGCGCTA GGAGAGGGAC CTCCAGGAAA TCCTCCATCA
88251 GGAGCTGGAA TTAACAGGGG ACGGGACCAT GTGTAGGACC ACTTTCCTTG
88301 GTAGCCGTAG TGGCTTGGAG TCGCAATCTC CCCATCAGGA AATCCTGTCT
88351 TAGTAACGGT TGCTCCTTTG AAAACAGCGA TAGGAATTGC TACTGGAGTT
88401 TGTAATGACA COATATCATA AAGATCTGTA ACGTCATGTT CATCAAGAAC
88451 CAGAGCTCCT GTTAAAGTGA CGTTTTTTGT GCCTGCATTT ACTGATGCTG
88501 AAACAAAATC TCTTTTTAGG AAGGAAAAAG GATCGAATGC TAACTTTCCA
88551 ATCGTAAAGT CTACAGCGGC AGGTGCTCCC GTGGGTGTTG CCAGCCCTAA
88601 GGTTCCTCCA GAGCCCAGGG TAAGCTGACC TGAGCCCTGA GTAGCGAAGC
88651 CAAGAACATT GACAACCGCA TTGTCAGTAA TCTTCAGTTC TCCACTAGCG
88701 ATCTTGACTG TTCCTAGAAG TATAGTTGTC GTGTTGGCAG GCAACAGGAG
88751 TTCTGTAGAG GAGAGTCCCT TACTTGTAAA GACTACAGAT CCTGAAGCGC
88801 CATTAGCGTT GATTGTAATG TCTTTATTAG ACGGACTTGT GGTTGGGAGG
88851 CTATGTGTAA TGGGATCATA AAATACAAGA CGTGAGCCTC CTTGTGCAGA
88901 TAGAGACACA ATCTCTCCCC CTGCTTCTAC AGTGATGGCA TTGCGGATTC
88951 CAGGTTTTGT ATTGAGCATG TTTCCTTGGA ACGCAATATC ACCTTCGGAA
89001 GCCAAAATCG TAAGTGTCGA TGTTTGCTTC GCAGGGTCTC CAACAGAGGG
89051 GCCTATGAAA ATAGCTCCAC CCTTCTCCGC GCGGTTTCTA GAGAATTCAA

89101 TAGGACCGCG TCCTTGAATA TTGAGCACTT TAGCACAGAT GGCTCCGCCA 

89151 TTTCTCGCTG CAGTGTTGCT ATCTAGGAGC AAGGCACCCT GGTTCCCTAC 

89201 GATGTTGCAG GTTTCGGCGT AGATCGCACC CGCATCATTT GGTGTACCAT 

89251 TATAAGAGAA GGTGCACTTC CCCTGATTGT TTTTTAATTC GAAAGTTCCC 

89301 GTAGGGATGC AGATGGCTCC GCCACTTCCT AAAGCATAGG TTCCTGATGA 

89351 AGGGGTGACG CCTTTTAAGC TGTTCACACA GGAGTTGGCG GTGAAGATGA

89401 TGCATCCAGA ATTGTTTTCA AAAAGGAGAG AGCTTCCTCC ATAAATCACG
89451 CCGCCTCCTG ACGAAGTAAA GTTATGGGAG AAGCTCATGG TAGCGGTGTT
89501 ATTGATGAAT TTAACGACTG CCGTGGGAGA ACTGCTAATT GCAGAGCCGA 
89551 AGTTGGCAAG GTTCCCAAAA AACTTGATCG ATTGGGATAT GTTTTCGACA 
89601 GTGAGTGAGT CTGTAGGTTG GAGAGCAGAG GCTCCTGTAG TTACTGTAGT 
89651 GATTGCGGGA GTTGCCGACG TAGTTACAGA TGCTGGAGAA TAGGCAACAC 
89701 GGTTCGTAGT GAAGATCAAA TCTTTGATAG ATTGGAAAAC AATATCTTTT 
89751 CCGTAGATGA GGCCGCCAAG TCCTGTCGAT TGGTTCCCTG TAAAGTTTAT
89801 AGAAGAAAAA TTCTTGAAGG TCAGAGTCTC TGCGGCAGAA AGTAGCGCAT 
89851 AGTTACTATT TGTAGGAGCT TGACAAGAGG TAAACGTTAA GGAAAGGCCC 
89901 TTCCCTAATA AGGAGAGATT GCTGGAACTA TTGATAAAGG AACTTGAAAA 
89951 ACTCGCCCCC AAGAAGTTTG AACAAACAAA ATCTGAAGTG AGTAAGATCT 
90001 CTTCACCCTG ATTCGTAATT GGAGGAACAA AGTTCCCTCC GAGTCTAGTC 
90051 TCAGCAAACG CGCAACTTGC ACTACATAAA CAGGCAAGTA GACAAAAAGA
90101 TGAAGATTTG AAAGAAAGAG GCATGCCTCA ACCCTGTCGT TAAATAAGGT 
90151 TTAAAATCGT AATTTGCTTC CTATATCTAG AGTATATTGA CGGGAGGTTC 
90201 TGCGAATATC ACATCCACAG TGCCCGAAGA TCTCTAGAAC ATCATTTACT 
90251 GAAGTATGGC TGGATGCTTG TACTAGCAAA GTACTTCTGG TTAGATTATT 
90301 CCCTATAGAG GTCCACGTAG CTCCATTAAT AGGTAATGTC GTATCGCAAT 
90351 CAGGATTGTG ACGATAGACA TCAGGAGCAT AGGCTACGAT TATAGTGTAA


90401 AAATCTGGTC GATTATGAGA ATTTTTACCA AAGCGGACGC CTACGGGAAC 

90451 TGCAACGTTG AGTAGATGAC CGTGTCCAAA AATCCTCCCC TCGGGGGTAT 

90501 TTTCTTGGAT GCCCCCATGA GTCGCGTAAG CAACTTCAGC TTTTACAAAG 

90551 GGAATGATCT GCTTGAGGTT TAAGATGCGA GAAGAGAGAG TGATGGGAAG 

90601 GTTCCCTTCG AGTTCTCCTA ACCAGCAATT GTTACTCCAA GAGCAGCGCC 

90651 CTTTAGGCAA TTTTGTATAT GAAGTCTTTA CTTTCTCATT GCTACGGCTA 

90701 TAGGTAACTC GAGAAGTGCC TCCTGAGAAG AATCTCGATG ATCCAAACAG 

90751 AGACTTGGTG ATGTTAGAGT ATACTGTAGC GAAATAAACG TTAGAATGAC 

90801 CGTGACCTAC GAGGTAATCC TTAGATTTTG TAAACAGCTG TCCAAAACCT 

90851 AGACTTAACG CAGCATCTGG AGTGATGCGT GTGTAGGTAT TGATGAGGTA 

90901 GCCTCCACCC ATATGGCGGT AGCTGCGTGC ATCACCGGTA TGATTCGCAT 

90951 GGAAGAAATT TGTAATTCCT GTGATGCTCA GTTGCTTCCC AGGGACATCT 

91001 TCGCCATCAG CTGCTGACGC TTGACTTACA GCTCGTAAAT CTATGACGTT 

91051 TGCCCATAGG CTATTAGGAA TGAGGGGAGC AAGACGCTCC GGATGAGGAA 

91101 GATAACCGGT TTTTTTCCAA TTTCCTGTGA CCGTATGGGT TGTCGTGTCT 

91151 ATGGTAAACT CCCAAGTTCC TTGATACCCA TAGGGAGATT GCTGATAGCC

91201 GTTTGTGCCG AGACTGAAGT CCGTAGTGGT TACAGTATTT GAAGTCGCTT

91251 TGAGTTCTAA AATCGGAACT TGCTGTAAAT CTTTATTAAA CATCCCGTGG
91301 TTGTCACAGC AATCTTGAGA GTTTTTCACA AGTCCTAAAG TTCCGGATAT 

91351 AGTGAGAGCT CCATTGGTAC TCTGCACATT AACGACAGCC GCTTTAGTGC

91401 CATCCAAAGA ATCCAGATTG ATTACAAGCT TGTTTAAGGT GATAGCACCG
91451 TCAGTATTAT TAGCTCCATT TGTAGTTGCT AATGTGGTCC CTGCATCCAT
91501 GATGACGACG GACTTTTCAT CTTGCGTGAA GTTATGAACA TTTAAGGTAG 
91551 CACCGTTTCT TAAAGCGAGA GTACCGCCTT CAAGTTCTAG CTTTTGGTTT 
91601 AATGTAGATG TAGCATTTGC AGGGGTTGCT GCTTCGGTAG CAGTGAGGGT 
91651 TTCTCCTGAA AAGACAATAG TCCCTGAATA CGCACCATCT GCACTGGCTT 
91701 TGGGATTGAC GACCACAGTA GCGGCTGCGG ATGCAGCAGA TAAATCATCA
91751 GATGTAATCG GATCATAGAA GTATAGGGTA TAGCCTTGCG TAGCTCCTAG

91801 AGTGGCAAAC TTGGCATCTT TTCCGAAGTG AATACTATTG CGAGTAGGTG 

91851 TCCCACTAGT GATGCTGAGG TTCTTGTTAA AGAGGATATC ACCTTGATTT 

91901 GCGGATAGAG AGAGCTCCCC AGATTCGGGA ATAGCAATAG CGCCTCCTTT 

91951 TCCAGCTGTA TTTCCAAGAA ATATTGTAGA TTTATTAGAA TCTATAGAGA 

92001 TCTTTTTGCC ATAGAGGGCT CCTCCTTGTT CGGAGGCAAT GTTTTCTAGG

92051 AATGTAACGC TGTTTTCTCC AGATATAGTC AGGCTAACAC CTGTTGGTGG 

92101 GGGGGTAGCT GGAGGAGTAC AGAAGATGGC GCCTCCATAT CCTAACAAAG 

92151 GAGTGACTGC TGGTGGTGTA GGTGGAGGTG TAGGTGCAGG TAAGGAGTTT 

92201 TGTGGAGATG CTGTATTGTT TTGGAAAGTC AGGTCGCTGT TATTAGAAAA 

92251 TGTGACATTT CCGTTAGCAT AGATAGCGCC TCCTGAGCGC GAGCTATTAT

92301 TAACGAACAA GACTCCTGAG AGGTTCCCAG AGGTGAGCAT AGATCCTCCG

92351 GTAAGGTAAA TAGCCCCACC ATAGATCCCT GTAGCATTCG TTGAGAAAAT

92401 CACAGGAGCG CTATTGTTGA TGAGGTTGAT CGCTGCAGAT CCCGTGAGGG
92451 CCCCTCCATT AGAGATGGAT CCATTACCAT TAAAGAGAAG GCTCTTTTTC
92501 GTATTTTCTA TTGTGATGCT TGTGCCTCGA ATGGCAGCTC CAAATCCTGC
92551 AGAACGGTTG TATTGGAATA GTATGGAGTC ATTGTTTGTA AAGAGCATGG 

92601 GCGTTGTAGC GTAAATCGCC GATGCGTGAG GTATGACATT ACTCGCTGAG
92651 GTATCTGAAG TCAAAGATTC ACAGTTATCG AAGATCATCT GACTAAATCC 
92701 TGAAAAACTC AAGGGACATA GTTCAGGATT TTGGGTGATT ACACTACTAA
92751 TCGCGGCTCC GTCAGCTGAA GAACGGATAT TTAAGAAGGA GAAAACCCCA 
92801 CCTTTTCCTA AGATTTGTAG TGCTCCCGCC CTATTGCTAA AGCAACTGGA 
92851 AGAGGTTCTG GATATGGCAT TATCAAGATT CGCAATGTAG AGATCCCCTG 
92901 AAAAAATACA GAGTGTCCCT CTAGGATCAG AAAGTGTTGT GTAAGGAAAA 
92951 ATCTTCCCAC TCGATCCATC AAAGTTCTCG GAAGGCATGA TAACTTCTAC 

93001 AGTAAACGCT GTTGAAGCAA AACATGGCGC CAGTGTGGTA GAAATTAAGA
93051 ACTTACGAAT AGACGTTTTC ATTTGCACGT AGAGATGAAA CCAGATTATC 
93101 CTACAAATAA GGGAAAGGCT GTAAAAAAAC AAGTACAATA AGACACAGTT

93151 TTAATCTCTT AATTTTGACA GCTTTAAGAT TACAGGATAT TTTAAAGGGC 

93201 ATTTTCCCAT TTCTTACATT GCTTTCTTAG AAGAATACTT GATAGAAAAT
93251 GGCGATTCTA TTTTTGAAAA ATCTCAAGAA ATTCTCCCAA ACGAAGATGT 

93301 TTTAGAGAAC CTGTAAAGTA GAAAATGGCG CTACGCCCGA GACTCGTGAA

93351 CATGTGTGCA TAGAGGGACC TATCTCTGAA TACAGATAGG TCCCAAAATC

93401 TTCTTAAAGG GGATTCCCTT AATTATAGTG TAGACTTAGA GATTATGGCG
93451 TAGAAGTGAT CTCAGGATTC CAAGTGAGGA AAACAGTTTT CTTAGTTGGT
93501 GTGATCCTTC TGTCTTTATC CAGAGAAGGG GGGTACTCTT CCCAGCTTAG
93551 GGACCAGAAA CCTCGAACGG CGTCATTCTC TGTTGTGACT TGGGTTCTCT
93601 CCAAAGTGGT CTCCCCTAGG AGAAGACTGT CAAAAGAAGG CCCTAGAAGT 
93651 TCAAGAAGAG GAATCGTAAA GGCTTCCTTA AATTGAGGAA AGTCATAGAC 
93701 GGACCAATAA GCCTCAAAGG CAACTTTAAG TTTTTCTAAT GAGACAAGAG 
93751 CATCCTTGTC CTCGGCGCGA ATCCTTACAG GAACAAAGTT GTCGGTATCT
93801 TCAATCAGGA TGTGCAGATT CTGAACCCGA GCATCTCCTG AGCAGAGCAG 

93851 AGTGGTTCCT GGAGACATAG TAAGCGTAGA GCTTGCTTCC TGCTTAAAAG
93901 AATGCAGTTG CAAGGTAACC CCATCCGATA GAGAGAGAGT TCCTCCTGCT
93951 AATGTGACAT CTTGTAGGAT TGTGGAAGTA AGATTTTCCG CACAAACTTC

94001 ATGATCATCC AGGCATAGTC CTGAGAAGCT AATTGTTCCT TCATAAGTTT
94051 CCTTTCCTTC AGGAGCATTG ATTACAAGAT CTGTAATTTT ATGCGACTCG 
94101 CTATGGCTTA TAGGATCATA GAAATAAACT CCGGATTCTG AAACAGCACG 
94151 TAGGTTCTTA AACTGTGCTC CAGATTGCAG ATGGATGGAG TTGTGTATTG 
94201 TATTTCCGTC TTGTGATGCT GTATTTCCTT TGAAGATGAG ATCTCCGCTT 
94251 TTCACGGATA TAGAGATCGA TCCTCCAGGA GCAATGGCAA TGGCTCCTCC 
94301 ATTACTATTC ACGTCATGAT AAGCATGATT ATTTTCAAAA CACGAAGGTC

94351 CTCGAGTCGT GAGTGTGAGA TTGTGGGTAG ATATGGCGCC GCCATAACCT

94401 TGGCTCACAT TGTCTCTAAA CACCAGGTAG CGGTTCCCGC TGAGATTTAC
94451 TGAAGGACGA CTCGCCTTAG AACCTAAAAG GTAGGGAGTA TAAATCGCAG
94501 CTCCACTCCA CGAAGAGTAG TTCCCACAGA AAGTCACTTC CTCACTATTT
94551 TCGATCATCA CGGAACCAAG ACTATAAATC GCTCCTTGTC CTTGAGGTAG
94601 TAGAGGTGCT GAGGTGAACG CTAAGTAAGA AAAATTAGAG AGAGTGAGAG
94651 TGGTGTCTCC AACGCGGTTC GAAATGGCAG CGCCAAAACC CTCGGTCATA
94701 AGGTTGTGAA AAGTGAAGTT GCAACGGTTG CCCATGAAAA AAAGATTCCC
94751 AGATCGATTT ATAAAAACCC CAGCATCTTC TTGATCATGC TTAACGTTGG
94801 AAATCCTCAC GTCATCTAGA AAGATGTAAG AAGTTCCTTC TGGATAACAG
94851 GTAATTTTAG GTTCTAAGCT TTTATTATTG ATAGCACCGT TATAACCATC
94901 ACTTTCATGA AGATATACAA CTTGTGCTGC TGCAGGGAGA GCGAGGAATA
94951 AAGCCGAGCA GGTAAGAAAA TTTCGAAGTA TGGTCATGGT TTCCTCGTTA
95001 AATCAATAAG GTTGAAGCAA CTTTAATAAA CAAGAAAAAA AGAAGTCAAT
95051 AAGAATAGAT TATTGTCTAT TAATTATTTA ACTGTTTTTA AAATAAAATT
95101 ATAACTAGAA ATTATTAAAA GAAATCTTTT TTGAAGAGGG ACAAATGTTA
95151 TTTTTTACAG TTTGCAAGGA AAGCATTCCC TATAGCAAAT ATTTCCCTAA
95201 AAGTATGAGA AAACTCCCTA GAAGAACTAG GGAGTTTTAG CAATCTAGAA
95251 TCGGAGTTTG GTACCAACAT CTACATTGTA GTTCCTTGAA GATCCACGGA
95301 GTTCCATAGC GTAATGTCCG AAGAGCTCAC AATTGGAGTT GTAGACGTAG
95351 TTGTTGCTAC CCCTCAGTAA AAATGCCTGT CTTGAAAGAT TGCCACCGCG
95401 AATTTTCCAA GAGTCTGGGC TCATCACAAG AGTCGCTGTA GATTGGGGAT
95451 TGTTACGATA GACATCGGAA ACAAAGAATC CTGAGAGATC ATAGGTGTAG
95501 GAATCTCCGA TATCCCCCTG CACGAATTTC GCACCCACAG GAATCGAGAG
95551 GTTAAGCAGC CTTCCAATAC TAAAACCACG GCCATCACTA GAGCTTTCGA
95601 AGAAGCTATT TTGTGATACA TAAACCATTT CGACTTTCAT CTGTGGAATG
95651 AAGGfCTTGA AAAGAGGATG TGGGTTGGAA AGAACAAAAG GAAGGTCTAG
95701 GCCGATACCA CCAGCTATAC ACTCGTXGCT CCAAGAACCT TCGGATTCTG
95751 GCAATGAGGT ATAGTGCGTT TCCATACGGT TGTCTGAATG GCTGAACGAA
95801


ACTTGG AC A 1 




GGGAATTTCC 
ovjo/^ri X X X v_>^ 


CTAGGGAATT 


TTTCTATAGC 


95851


TGA i 1 L AoAA 




TTCCTAATCT 


CAAATAGTTT 


TGGGGTTGTA 


95901






AATAAAGTTC 


CACCGTAGGT 


TCTAGAGTTG 


95951


mrpptrp/^ TV ppr* A 
1 J. O J. 




TTTGTCTCTA 


GCAAAGAGAT 


GGCAGAACGC 



96001 


AAAbG I AAA i 


A f*; n T p n T P T T 

/\V30 X X ^ X X 


TAPrJAGTGTG 


AGCACTTCCA 


CCGATGACGT 


96051


AOLC, 1 CUAvjA 


PPT ATr; APPn 


AAGPPTTTGC 


GATTTTCATC 


TCCAGTCTTA 


96101 


i CjjC- AvjLjAAG i 


TPPTPATPPA 


GGAAAPPPAG 


AAACCTTGTT 


TGTGTTCCAT 


96151 


ACCAGl iUUVj 


CCGATCTCTA 


CAAGCTGTTG 


CAGAGAGCGA 


ATGTCAGTAA 


96201


A A r^TT^r^r^r^ & 
ACjAL iUUt-L-A 


TAGGGTATTG 


CATACTAACG 


CAGATTTTCT 


TTCGGGGCTG 



96251 


GtjAACAAA 1 U 


CTGTTTTGGT CCAAGTTGCC


GTGGCCTCTT 


TTGTATTTGT 


96301


AGL i G 1 A I L.\- 


GTAGTCCAAT 


TAACATTCCA 


TTGTCCTTGG 


AATCCGTATT 


96351 


CTGAAi iAGG 


ATCCTCAGCA 


GGAACAGGGA 


TAAGGCTGCT 


GATGTCAACG 


96401


TTAG 1 A 1 C AA 


CATCAGCATC 


AACCGTGATT 


TTTAATAGAG 


AGAAGAGCTG 


96451


GTCATGGC 1 G 


AACATATGAC 


TTTCATAAAT 


GTTCCCTTCA 


ATATCAATCA 


96501 


GGTTGAGG X i 


PPP APATAPG 


ATCACTTTAT 


TTGAAGCACC 


TTTTGCTGTT 


96551 


AGGCTGACGG 


GCTGCTTAAG 


ACCTAAGGAG 


TCAACATTGA 


TTCCTAGGTT 


96601 


CGTGAI IGiA 


ATACTCCCAG 


CTGTAGTTGA 


TAATGTCGTT 


CCTGAATCCA 


96651


TGCCGAGGAG 


AGAACCGGCC 


TCTTGAGAGA 


AGCTCGTGCT 


CTCTAAAGTG 


96701


AG 1 GGG i X i i 


GTAGCAATAA 


CTTTCCTCCG 


GATAGGGAGA 


CTGGCTGCGT 


96751


A A T*/** A A O A 

GAATG AAG A i 


TTTAAATTGT 


CAGCAACTTT 


AAGTTCATCT 


GCTGTTAGGG 



96801 


TTTCTCGAGA 


AAATAGAATC 


GTTCCTTGAT 


ATGGATTGAG 


AGCTCCCGCA 


96851


GAGGGG 1 J. A 1 


TTATCTTCAA 


TACGTCTGAT 


GAGGTTCCTT 


CTGAAGTGAT 


96901


GGo A i G A 1 AL) 


AAGAAAATTG 


TATGATTTTT 


AGCAGCCCGT 


AATTCCGTGA 


96951


A J. 1 1 GGGG 1 1 


ACTTCCTATG 


TTGATCGCAT 


TACGTTTAGG 


AGTATCGGTA 


97001 CTTCCGGTTG TTGTAAGGGT ATTTCTTACA AAGGTAATGT TTCCTGTCTC
97051 TGCAGAAAGA CTGAGCTCTC CTGAGGCATC GATGCTGATA GCACCCCCCT
97101 TAGGAGTTGC TGATGAGACA TTATTTCGTA GAAACTCTGT AAAGCCTCCA
97151 GAGGAAAGGG CTAGCTTTTT AGCATGGATG GCGCCACCGC TTGTTTCTGC
97201 TACGTTTGAA GCAAAGATCA GAGTCTTATT GTTAGAGATT ATCAGTTCAG
97251 GAGATCCACT CGCCTTGGTG TTGCAGATCG CACCGCCAGT AGTTTTCGCT
97301 GCATTCCCTT CAAAATATAG AAATTTGTTG TTCGATAGTA TCGACGTGCC
97351 TTCATCATCG ATAGCGCCTC CTGACGTAGA CGCTATGTTA GATAGGAATC
97401 TAACATAACC TGTGTTATTT GCTATGCGAG CGCCTGCTGT AGTAGCAATT
97451 GCTCCTCCCT TTGTTGATGA AGAGTTGTTA CTAAAAAGAG CATCTCCAGA
97501 AGTGCCAGTT AAAAGGAAAG ACGCTCCTTT GATAGCTCCA CCATCTGCAG
97551 TAGAAAAATT CCCAGCAACT ACAAGTTTAC GAATATTTTC TAAATTTACG
97601 CCTCCTGCTG AGGAAAGCGT TCCCTGACCT GTAGTAACCG TTGTGCTAGG
97651 AGAGGAATCA AAACTCAGTA AGGAAAACCC TGAGAAGGTA AGATTCTTAT
97701 TTGCTGTTGT AGATGCAGCA GCACCTGCAT GAGTGCCAGC ATCTATAAAG
97751 CCAAACGTTA AGCTATGACC GTTCCCCAAG AAGGTAAGAT TGTCCGTGGT
97801 TTGCTTAAAA CAACTGTCAG ATAAGGGAGT GCCTTTTCCA GGCTCGTAAA
97851 AGAAGACATC TCCTGTTAGA GAATATGTTG TGGCTGAAGT TTTTGGAGTA
97901 AACGTTCCTG AATCGATATT TCCATTAAAG CTATCATCAG GTGATAAAAG
97951 TTCCTCGTTA GCTAGTGACT GTAGGTGACA TGAGAAAGCT AACACGGAGG
98001 AAACTAAAAC CCAAGGAATC GAAGTCTTCA TGGTAATGCT TTTGTTTTTT
98051 AGAGAACTAT TCGCATCAAT ATAGAAACAA AATAAGTAAA TCAAGTTAAA
98101 GATGACAAAA CAGCTGTCAA GAATTTTTAT CTTGACTCTC TGAGTTTTCT
98151 ATTTTATATG ACGCAAGTAA GAATTTAATA ATAAAGTGGG TTTATGAAAT
98201 CGCAATTTTC CTGGTTAGTG CTCTCTTCGA CATTGGCATG TTTTACTAGT
98251 TGTTCCACTG TTTTTGCTGC AACTGCTGAA AATATAGGCC CCTCTGATAG
98301 CTTTGACGGA AGTACTAACA CAGGCACCTA TACTCCTAAA AATACGACTA
98351 CTGGAATAGA CTATACTCTG ACAGGAGATA TAACTCTGCA AAACCTTGGG
98401 GATTCGGCAG CTTTAACGAA GGGTTGTTTT TCTGACACTA CGGAATCTTT
98451 AAGCTTTGCC GGTAAGGGGT ACTCACTTTC TTTTTTAAAT ATTAAGTCTA
98501 GTGCTGAAGG CGCAGCACTT TCTGTTACAA CTGATAAAAA TCTGTCGCTA

98551 ACAGGATTTT CGAGTCTTAC TTTCTTAGCG GCCCCATCAT CGGTAATCAC 

98601 AACCCCCTCA GGAAAAGGTG CAGTTAAATG TGGAGGGGAT CTTACATTTG 

98651 ATAACAATGG AACTATTTTA TTTAAACAAG ATTACTGTGA GGAAAATGGC 

98701 GGAGCCATTT CTACCAAGAA TCTTTCTTTG AAAAACAGCA CGGGATCGAT 

98751 TTCTTTTGAA GGGAATAAAT CGAGCGCAAC AGGGAAAAAA GGTGGGGCTA 

98801 TTTGTGCTAC TGGTACTGTA GATATTACAA ATAATACGGC TCCTACCCTC 

98851 TTCTCGAACA ATATTGCTGA AGCTGCAGGT GGAGCTATAA ATAGCACAGG

98901 AAACTGTACA ATTACAGGGA ATACGTCTCT TGTATTTTCT GAAAATAGTG 

98951 TGACAGCGAC CGCAGGAAAT GGAGGAGCTC TTTCTGGAGA TGCCGATGTT 

99001 ACCATATCTG GGAATCAGAG TGTAACTTTC TCAGGAAACC AAGCTGTAGC 

99051 TAATGGCGGA GCCATTTATG CTAAGAAGCT TACACTGGCT TCCGGGGGGG 

99101 GGGGGGTATC TCCTTTTCTA ACAATATAGT CCAAGGTACC ACTGCAGGTA 

99151 ATGGTGGAGC CATTTCTATA CTGGCAGCTG GAGAGTGTAG TCTTTCAGCA 

99201 GAAGCAGGGG ACATTACCTT CAATGGGAAT GCCATTGTTG CAACTACACC 

99251 ACAAACTACA AAAAGAAATT CTATTGACAT AGGATCTACT GCAAAGATCA 

99301 CGAATTTACG TGCAATATCT GGGCATAGCA TCTTTTTCTA CGATCCGATT 

99351 ACTGCTAATA CGGCTGCGGA TTCTACAGAT ACTTTAAATC TCAATAAGGC 

99401 TGATGCAGGT AATAGTACAG ATTATAGTGG GTCGATTGTT TTTTCTGGTG

99451 AAAAGCTCTC TGAAGATGAA GCAAAAGTTG CAGACAACCT CACTTCTACG

99501 CTGAAGCAGC CTGTAACTCT AACTGCAGGA AATTTAGTAC TTAAACGTGG 

99551 TGTCACTCTC GATACGAAAG GCTTTACTCA GACCGCGGGT TCCTCTGTTA 

99601 TTATGGATGC GGGCACAACG TTAAAAGCAA GTACAGAGGA GGTCACTTTA 

99651 ACAGGTCTTT CCATTCCTGT AGACTCTTTA GGCGAGGGTA AGAAAGTTGT 

99701 AATTGCTGCT TCTGCAGCAA GTAAAAATGT AGCCCTTAGT GGTCCGATTC 

99751 TTCTTTTGPA TAACCAAGGG AATGCTTATG AAAATCACGA CTTAGGAAAA 

99801 ACTCAAGACT TTTCATTTGT GCAGCTCTCT GCTCTGGGTA CTGCAACAAC 
99851 TACAGATGTT L'CAUUtiUTTC UTAUAUTAUC i v^mv^ i ^ i ovjVj i

99901 ATCAAGGTAC TTGGGGAATG ACTTGGGTTG ATGATACCGC AAGCACTCCA 

99951 AAGACTAAGA CAGCGACATT AGCTTGGACC AATACAGGCT ACCTTCCGAA 

100001 TCCTGAGCGT CAAGGACCTT TAGTTCCTAA TAGCCTTTGG GGATCTTTTT 

100051 CAGACATCCA AGCGATTCAA GGTGTCATAG AGAGAAGTGC TTTGACTCTT 

100101 TGTTCAGATC GAGGCTTCTG GGCTGCGGGA GTCGCCAATT TCTTAGATAA 

100151 AGATAAGAAA GGGGAAAAAC GCAAATACCG TCATAAATCT GGTGGATATG 

100201 CTATCGGAGG TGCAGCGCAA ACTTGTTCTG AAAACTTAAT TAGCTTTGCC 

100251 TTTTGCCAAC TCTTTGGTAG CGATAAAGAT TTCTTAGTCG CTAAAAATCA 

100301 TACTGATACC TATGCAGGAG CCTTCTATAT CCAACACATT ACAGAATGTA 

100351 GTGGGTTCAT AGGTTGTCTC TTAGATAAAC TTCCTGGCTC TTGGAGTCAT
100401 AAACCCCTCG TTTTAGAAGG GCAGCTCGCT TATAGCCACG TCAGTAATGA

100451 TCTGAAGACA AAGTATACTG CGTATCCTGA GGTGAAAGGT TCTTGGGGGA
100501 ATAATGCTTT TAACATGATG TTGGGAGCTT CTTCTCATTC TTATCCTGAA 
100551 TACCTGCATT GTTTTGATAC CTATGCTCCA TACATCAAAC TGAATCTGAC 
100601 CTATATACGT CAGGACAGCT TCTCGGAGAA AGGTACAGAA GGAAGATCTT
100651 TTGATGACAG CAACCTCTTC AATTTATCTT TGCCTATAGG GGTGAAGTTT 
100701 GAGAAGTTCT CTGATTGTAA TGACTTTTCT TATGATCTGA CTTTATCCTA 
100751 TGTTCCTGAT CTTATCCGCA ATGATCCCAA ATGCACTACA GCACTTGTAA 
100801 TCAGCGGAGC CTCTTGGGAA ACTTATGCCA ATAACTTAGC ACGACAGGCC 
100851 TTGCAAGTGC GTGCAGGCAG TCACTACGCC TTCTCTCCTA TGTTTGAAGT 
100901 GCTCGGCCAG TTTGTCTTTG AAGTTCGTGG ATCCTCACGG ATTTATAATG 
100951 TAGATCTTGG GGGTAAGTTC CAATTCTAGG AGCGTCTCTC ATGTCTCAGA 
101001 AATTCTGAGA GAGATCGCAT TTAGGATTTT CTTAAACACG ACTCACCTTG 
101051 TTTTTGAACC AGGAGAGATC GGGGATTAAA AAGGCAAGAG GGCAGAGTTC 
101101 GTGAGGTCAC GTACTCTGCC TTTCTTGTTA CAAACACGTT TTAAAATTAA 
101151 GGAAATTTTT TAATAGAAAC CCGTTCTTTA AAATACGTTT CTTTAATTCT 
101201




GATAATTCAC 


TATTTTTAGA 


TCCTAAATTT 


TAAGTGGTTT 


101251


TTGTTATGCT 


TCTTATAGAG 


AATAGCTGCA 


AAGATTAGAG 


TTGCAGAGAC 


101301


nQTACGTCTC 


TTTCTTTTTT 


AAGGGAAGGG 


GTGTTGTTAC 


ACCCATCCTA 


101351


AGATTTGTGA 


GATTCCCCTC 


AGGCAGTAAC 


TTTTACAATC 


GTACTTTATG 


101401


TTTTGATCTA 


GCTGTTTTCT 


TGTCTTTAAT 


TTATTCAACC 


ATCGAGAAGA 


101451


GAGATCCATG 


AGTGGAAATG 


TATTTTATTA 


GGATCATCTC 


TAAGGATGGA 


101501


AATGATGAGC 


CCATTCCAAC 


AACCTGAGCA 


ATGTCATTTT 


GATGTTGTGG 


101551


nAAGTTTCTT 


ACGTCCTGAA 


AGTCTTACAC 


GAGCACGCTC 


TGATTTTGAA 


101601


nAAGGAAGAA 


TTGTCTATGA 


GCAGATGCGA 


GTTGTCGAAG 


ATGCTGCTAT 


101651


TPf^T A ATPTP 


ATAAAAAAGC 


AAACAGAAGC 


AGGTCTTATC 


TTTTTTACTG 


101701


ATnnnr; A ATT 


PPGTAGGTAT 


AGTTGGGATT 


TCGACTTTAT 


GTGGGGATTC 


101751


p ATnnpHTnr; 


ATPGTPGCAG 


GGACTCTAAT 


GACCCTGAAA 


TTGGAGTGTA 


101801


TPTT A A AGAT 


AAAATCTCCG 


TATCAAAACA 


tCCGTTTATA 


GAACATTTCG 


101851


AGTTTGTPAA 


AACTTTTGAG 


AAGGGAAATG 


CAAAAGCAAA 


ACAAACGATT 


101901


PPTTPTPPAT 


CACAATTTTT 


CCATGAGATG 


ATTTTTGCTC 


CTAATCTGAA 


101951


a A AT APTPGG 


AAGTTTTATC 

c^XWJ X X X X X V» 


CTACGAATCA 


AGAGCTAATT 


GATGATATTG 


102001


TPTTTTATTA 

1 X X n X X r\ 


TCGCCAAGTC 


ATCCAAGATC 


TTTATGCTGC 


AGGTTGTCGT 


102051


A ATTTGP AGT 


TGGACGATTG 


TGCTTGGTGT 


CGCCTCTTGG 


ATATACGAGC 


102101 


GPPTTCTTGG 


TATGGTGTTG 


ATTCTCATGA 


CAGGTTGCAG 


GAAATTTTAG 


102151


A AP AGTTTTT 

f\r\\^r\\J X X X X X 


ATGGATCCAT 


AATTTAGTGA 


TGAAGGATAG 


ACCCGAGGAT 


102201


PTTTTTGTAlA 


GTCTGCATGT 


CTGTCGTGGT 


GATTATCAGG 


CCGAGTTTTT 


102251


PTPTAGACGA 


GCTTATGATT 


CTATAGAGGA 


GCCTTTATTT 


GCTAAGACCG 


102301


ATGTGGATAG 


TTATCACTAT 


TATTGGGCTC 


TTGATGATAA 


GTATTCAGGA 


102351 


GGTGCTGAGC 


CTTTAGCTTA 


CGTCTCTGGA 


GAGAAACACG 


TCTGCTTGGG 


102401 


ATTGATCTCC 


AGCAACCATT 


CTTGTATTGA 


AGATCGAGAT 


GCTVjlbbl 1 1 


102451 


CTCGTATTTA 


TGAAGCTGCG 


AGCTACATTC 


CCTTAGAGAG 


ACTTTCTTTG 


102501 


AGCCCGCAAT 


GTGGGTTTGC 


TTCTTGTGAG 


GGAGACCATA 


GAATGACTGA 
102551 AGAAGAACAG TGGAAGAAGA TCGCCTTTGT GAAAGAGATT GCTAAAGAGA

102601 TCTGGGGATA AAGAATCCGG AGTTTTTATC GACTCTAAGA GTTTTCGGAT 

102651 CATAGAAAAC ATTTAAATAT TCAAGAGTCT TTGGCTATTG GATCATAGAC 

102701 AGTCTTAGTA TACTAAAAAG TCTTTGGATT CTAAGACGGG CAGAGTTCGT 

102751 GAGATCACGT ACTCTGCCCA TTCTTTCTTG TGATCTAGCG ACTTCTTTGA

102801 ATCTTCGACC TCTTGTAATC TGGGATTTTT TCTAGTTCTT AGATTCCTCT 

102851 GATCTTTCGA CTTCTCCTCG TCTAAACAAG GCGCATTGTC TTTGAGAAGT 

102901 CCCTAGATAC ACTCAGGATC TCTTAGAATT TCTAAGGGAT CAGGAACGCT
102951 TTTAGAACTG GAACTTACCT CCAAGATCTG CATTGTAGCT GCGTGAAGAT 

103001 CCACGAATTT CCATAGATAG GTTACTTGTG ACCTCAAGAT TTGGAGAGAA
103051 GGCATAAAAG ATCCCTGCTC TTCCGATACC AGCTTGTCTT GAGAGATTCG 
103101 TTCCTGTAGT TTTCCACGAG GTATTGTTGA TTAGGAGAGC TGTCGTGCAG 
103151 TCAGGATTCT TACGATAGAC ATCGGCAACG TAGATGACAG TAGCTTCGTA 
103201 AGACGCACGC TCGTTTCTCG AGAATCTCTC GAAGGTAATT CCAATAGGCA 
103251 CAGAGACGTT AATTAAATCA CCGCTATCGA AAGATCGTAC CAAGGTAGTA 
103301 TTACGTTCTT TGAAGCTATC TTGGTGTATG TACGAAGCTT CTACTTTGAT

103351 GAAAGGAAAA TACGCGTGGA AGAGACCCTC ATGGCTTAAA GCAGTGTGTG

103401 GTAGGGAGCT CGCAAGTTCC AGAGCGCAAC CGTCATTATA CCACGAGCTC
103451 TCTCCCTTTG GTGCTTGGGT GTAATAGGTT TTCATAGTAT TTTTACTATA
103501 GATATAGCTG ATCTGAGCAT CAAAGAGGAC AGGCTGCTCA CTTTCAGATC
103551 CAGGAAGGTA GCGTAACAAG CTTGGAGAAG ACAAGGTCGC TAGATGCTGG 
103601 AGATGGAGAG AAGCTGCATA GGCAGAAGCT CTATTTTTAT TTATAAAGTG 
103651 ATCTCTATCT TTCCCGAATA ATTGGCAGAA GGCTGCAGTG ATAAGATTAT 
103701 CAGAAGCTAA TGTTGTAGTC GCTCCTACAA CATAACCTGC ACTTATGTGG 
103751 CGAAAACCTT TATTTATCTT CGTGCTATCT TTATGGAAGA AGTTCGAGAT 
103801 CCCTTCACAC CAGATGCCGC GAGTTTCTTG AGATTGGCGT ACTTTAGTGG 
103851 CTACAAGCTG TTGTATGGAG CGCACATCAA CAAAGGATCC CCATAGCGTG 
103901


TTAGCAACTA 


AGGTTCCACG 


ACGCTCAGGA 


TTCGGATTGT 


ATCCTGTTTT 


103951 


TGTCCAGGTA 


AGAGTCGCTG 


CTTTGGATTT 


AGTCGCAGTA 


TCCTCTTGCC 


104001 


AAGATAATGC 


CCAATTCCCT 


TGGTATCCCC 


AATGGATAGG 


ATTTTTTTCT 


104051 


AGGGGATCAG 


CAGCTAAGTC 


TGTGATGTGA 


ATATTCGCGG 


KjKj 1 Lb 1 uAvjU 


104101 


AGTAAGAGTG 


AGACAAGAAA 


AGACTTGAGG 


GTTATTCCAA 


GAGACATC 1 1 


104151 


CGTAGACATT 


TCCAGAAGGA 


TCTACAAGAG 


AGAGCGATCC 


Ab A i AAAU I \j 


104201 


ACTGTCTGAC 


TTGCTTGTGT 


TGCTTTTAGC 


GTAGCCTTCT 




104251 


TAAGGAATCT 


ACATTGAGAA 


CAAGATTATT 


GATAGTGATC 


C C ATC ACj C Cj (j 


104301 


TTTCTAATGT 


GGTCCCTGCA 


TCCATGAGGA 


GGGTAGAGCC 


CGGAGAl IGL 


104351 


GAAAAGGACT 


TAGCAACTAG 


AGTGACTCCT 


GATTTAAGAG 


AGAGTTGCCC 


104401 


TCCCGCAAGA 


GTTAGAGGTT 


GCTGAATTGT 


AGATTTGAGA 


TTATCAGCTT 


104451 


CTGCAGCTTC 


TGCTTCCGAG 


AGCTTCTCTC 




GATGGTTCCT 


104501 


TGATATGCAG 


GATTCCCTGC 


AAGGTCAGGA 


CCATTTAAGT 


TTAGAGCATC 


104551 


TGAGAGAGCT 


GCAGTGATGC 


TAGTTGTTAT 


AGGATCATAG 


AAGTAGATAG 


104601 


TATTGCCTTG 


AGAGGCTCGC 


AGCTGTACAA 


TCTTAGCATT 


GGTGTTl LCCjj 


104651 


ATGTTAATAG 


AATTTCTGGT 


AGTGGTCTGA 


CTCGAAGAAG 


CTCCTTTGAL 


104701 


TACTGTGTTT 


CCTTCAAAAG 


TGATGTCTCC 


ACCAAGAGCC 


GAAACjAC TLA 


104751 


AAGATCCAGA 


GTCAGCAATC 


GCAATTGCTC 


CTCCTAAGGG 


AGCTbLAGi A 


104801 


TCTATAGCAG 


AGTTGTTTTT 


AAAAAGCGTA 


GGTCCTCCAG 


AAGAAAGAAL 


104851 


TAGATTGTCA 


GTATAAATCG 


CCCCACCACT 


AGTAATTGCT 


/"< m TV rrimm/**^nn 7v 

GTAITICLI A 


104901 


TAAAGTTCAG 


TTCCCCGTTG 


TCTGATAGAG 


TTAAGACTGG 


m m rp/^ O ^ ^ O ^ 


104951 


GATGTACTAC 


TACAGTAAAT 


GGCTCCCCCT 


GTAGCTGAGG 


1 ICjLCjjb 1 LAL 


105001 


ACTATTGTTT 


ATAAAGCTAA 


TTGCTTTGTT 


GCTGCTAATA 


AAAL i bt, 1 AG 


105051 


CTTCCGTGTA 


AATGGCTCCG 


CCATTGTTCG 


CCGCGGTATT 


1 IV-AGAAAAi 


105101 


GATGCTGAGT 


TTAACGTATT 


GTTAATTGTA 


ATCCCTCCCG 


TGGAATAGAG 


105151 


GGCACCCCCT 


TTTTGCGTTG 


CTTTGTTTTT 


GGCAAACGTT 


AGGTTGGGGT 


105201 


TTAGCGATAG 


ACTGATAGAG 


CTGCCTTGGA 


GGGCGCCTCC 


ATTGTCATTA 
105251 GAAAAGTTTT GGCCAAAGTA GCAACTATAG TTCGACTGAA TAGAACAAGC
105301 TCCTGTGGAC TfGATGGCTC CTGTTCCTGT GGTAGCATTC GTGGTTTGTA
105351 TTAGTGACAA ATAGGAGAAT CCTGAAAAGG AGAGAAGCTT ATTTGCAGCT
105401 GTATTGGTAA AGGTACAGTT CGCTCCCGCA TCGATATTTT GTAGGAGAAA 
105451 TTGGTAGCCG TGGCCTTGGA AAGAAAGATT CCCAGTAGTT TCTTTAAAGC 
105501 AGGAAGCGGT TAGAGCTGTC GGAGATCCTG CATTGGTGAT TGAGACATCC 
105551 CCTGTTAGAT TATAGATAGT TCCATCTGCA TTTGTTGTTT GGGCTGGAGG 
105601 AGTGTAGGTT CCTGGTCCAG AGAAGCTATT GGTAGGTCCT AGATTGATTT 
105651 CAACAACAGC AGCAAACGCA GAGAAATTTA GTGACAAGGG AAGTGCTAAA 
105701 GATGACGAGA TTAAAAACCA ATGAAGAGAG GATTTCATGT AGAGGGCTAT 
105751 AGGTGGTTTA ACAAATTATT TCACCACATA CTGCAATAAA TTAAAGAAAG 
105801 CAAGAGGAAA GGAGAGACTA GTAAGTTAAG AATCTACAGG GTTTTTATAA 
105851 GAATTCCTCC CTAAAAGTTT AGGGAGGAAA GTAGGAACTA GAATGAGTAT 
105901 CTTAGCCCAC AATCTACATT GTAGATGTGT GCTGAGCCAC GAAGCTCATA 
105951 AGCAGCTTCC CCAGAGAGTT CTACATGAGG GGAGAGAGTC AGATGGCTTC 
106001 CAGCACTTGC TAAGAAGGCT TGTCGTGCGA GGTTTTTACA TAGCGAAGTC 
106051 CAAGAGGCTC CACTGACCAT TAGAGAAGTA CGCGAACGGG GATTTTTACG
106101 ATACACATCA CCAATGTAGG CTAGAGAAAT CTCGAAATTA TTTTTTTCAT 
106151 CTTCGGAGAT TTTTTCTAAC CGAATGCCGA CAGGGATAGA GCAGTTCACT 
106201 AGGTCTCCAT CATCAAAAGC ACGGGCTTCA GCGCCACTCT CTTTAAAGTT 
106251 TTGTTGGCGG CTGTAGACTG CCTGGAACTT TAAGAAGGGG AAATATCCCT 
106301 GGAAGAACGG TGCTTCTTTA GGGAGATATA GAGCCAGAGA TCCTCCGAGC 
106351 TCTAGAGCCC CAGAGTTATT GGTCCAAGAG CCTTGAGCTT CAGGATAGGA 
106401 AGTATAGCGA GTATCCATAT CATTTTTAGT GTAGCTGTAG CTTAGCTGGG 
106451 CATTCAAAAT GAGAGGAATA TCTTTCAGCA TGTCGGTGAT ACTTCCAAAT 
106501 GAGGGCATGG GAAGTCCTCC TAGGAATGCT CGATGTTGCA GGTATAGCGA 
106551 CGCTAAATAG TTATGAGAGG TATTTTCAAC TATAAACAGG TCTTTATCTT 
106601 TACCGAAGAG CTGGCAGAAA GCTACACTGA AGATATTTTC AGAAAAATCT

106651 TCAGCACTTC CTCCAACAAT ATAGCCGTAG CTTTTATGTC GGAATGCTTG 

106701 GTTAGTTCCT GATTTATCCT TATGGAAGAA ATTCGCAGTT CCTGATGCCC

106751 AGAGTCCTCG TTGCTGATAG ATACTATTCG CTTGAGATGT CATGATCTGC 

106801 TGTAGAGTGC GAATGTCAGT AAAGGATGCC CATAATGAAT CGGGAACTAC 

106851 GGAAGCTCTA CGCTCAGGAT TAGGGTTGTA GCCCGTAGTT ACCCAAGTCA 

106901 TAGTTCCTGA TTTTGCAGTT GATGTGTCTG CCCAAGTGGC TTCCCAATGT 

106951 CCCTGATACC CGTAATGAGG TTCTGGAGTT TGTACTGGAG AAGTGAGAAG 

107001 CGCATCGATA TAAATATCGC TAGCAGCAGT AGCAGCAGTG AATACCACCA

107051 AAGGCTGCGT GAAGGCTTGG TTTATCGTAT GGCTTTCATA AAAATTGCCG

107101 CTACTATCTT GGAAAACAAG AGGAGAGGTT AGAGTTATAG TTTTGTTGGC 

107151 TCCTGCTGTT TCAATGGACA CACTCTTATT TCCCTCTAAG GCAGAAAGAT 

107201 CAACGACAAG TTTGGTAAGA CTGATAGCTT CAGTATCTGC TTTGAGCTTT

107251 GTTCCTGGTT GCATGAGGAG TGTAGAGCCT TCAGTCTGTG TGAAACCATT

107301 GACATCTAAC TCGACATTTC CTTTGAGTGC TAAGGTTCCA GAGGCTAGAG

107351 CCAATGGTTG CTTTAATATA GATGTGAAGT TATCAGCAGC TTTCGCTTCA 

107401 TCTGCAGAGA GCTTTTCCCC AGAAAATACA ATCGTTCCTG AATAATCTAA

107451 AGGCGAGTTG CTATCCGGTT GGTTGATGGT CAGAACGTCT GAAGCTCCTG

107501 TGGTGTTAGA TGCAATCGGA TCATAGAAAT AGATAGATTG GCCTTGGGCT

107551 GCCCTTAAGT TCGTAATTTT TGCTGACGAT CCCAGGTAGA TAGCATTCCG

107601 TGTCGATGTT GGCGCGGAGG TTGAGGTTAG AGTGTTGCCA AGGAACGTGA

107651 TGTCTCCTTG ATTTGCAGAG AGACTTAAAG ATCCAGAGTC GGCAATTGCA

107701 ATAGCGCCGC CCTTGCCTGC AGCTGTGTTC CCGCATCTAT TATTTGAAAA 

107751 TAGGGTAGGG CCAGCAGCGG AAAGATCTAG ACCATGGGCA CAGATTGCTC

107801 CGCCTTGAGT TACTGAAGAG TTCTCGGCGA AGGTCAGACT TTTATTTCCA

107851 GAGATAGTAA GAGTAGGAGT CTCTCCTGTT TTTTCACAAT AAATGGCCCC

107901 GCCCTTGCCT GCAGCATCTG TTGCAGTGTT TCCAGAGAAG AAAAGGGAGC
107951 TATTTTGAGT AATCGAGGAG CTGGCTTCAA AGCCCAGAGC CCCACCCCCA

108001 GTTTCTCCTT TATTATTCAT AAAGACTAAC TGGCCGGTGT TTCCTGAAAT 

108051 ACTTGCAGCC GCAGAGCTAT AGATCGCTCC ACCTAATTTT TTTGCGCTAT 

108101 TACTAGTGAA GGTTATAGAA GAGGTATTCC CAGAAATAGA AAGAGTTTTT

108151 GTGGTGATCG CTCCGCCATT GTTATTAGCT TCATTGGAGA CGTTTTGGCT 

108201 AAAGAGAATC GTTCCATTAT CGGTAAGATT TAAGGCTCCT GCAGAACTTA 

108251 AAGTACTTTT TCCTGAAGCA ACTGTAGTTC CAGGAGCTGC AATGAAGGAA 

108301 AGGTTAGAAA ATCCTGTGAA TGTTAGGGCT TTATCAGCAG TTGTGCTTGC 

108351 CGCAGCTCCT GCATTCGAAC CCGCATCTAC CGTGTTGAAT GAAAATGAGT 

108401 ATCCCTTTCC AGTAAATGTC AGATCACCCG TAGTTTCTGT AAAGCAGCAG 

108451 CCTGTTAATG CTGTGCCTTT CCCAGCATCG TTTATATAGA CATTTCCTGA
108501 TAAGACATAG TTCGTTCCAT TGGCATCTGC TGTAGATTTT GGAGTAAATG 

108551 TAGAGCCGCC CGCTCCATCA AAGCTATCTG TAGGGGATAA AGAAGCATCT
108601 GCTCCGTAAG TTGCAATGCT CAATAGAATG GGAGTGACAA GAGTCGAAGA 
108651 GATCAGGAGT TTGTGCAAGG GTATTTTCAT AGAAAGATGC TTGGGTTCAA 
108701 TTAATTAACA CGTTTTCGAT AATCTAGAAA CAAAACTTAG AGCCTAGGTT 
108751 TGTATTATAA TTTCGTGAAG AACTTCGTAC TTCAAAAGCG AATTGACCGA 
108801 AGATTTCCAT GTGGGGGTTC ACTTGGAAAT GGTTCGCAGC ACGAACAGAA 
108851 AAACCTTGTC GTGCGAGGTT GGTACCATAG GCCATCCAGT TAGCATCGCT 
108901 AGCTATTAGG GAAGTTTGAC ATTTAGGATT GCGTCGGTAA GCATCGAGTA
108951 TATACATAAG AGTAAGATCG TAAGTTCCCT TTTCTGATTT TGAGTCTCTT 
109001 TCGAAGGTGA CGCCTATAGG AATCTCTACG TTGATAAGCT CGCTTTTATT 
109051 GAAAGCGCGT CCTTCAGCAT GACGCTCGTA GAAGTCTTGC TGATGCGCAT 
109101 AGATATACTG TACTTTGACA AAAGGTTCGA CTTCTTTCAG AAGATACGGA 
109151 ACGGAAATAA CAAAAGGCAG GCTAGCTCCA AGATCTGCAC AGAAGGCATC 
109201 GTTTCTCCAA GAACCCTTGA TGATAGAGTT ATCGGTATAA TATGTCTTCA 
109251 TGTGGTTGTC TGTATGGAGA TAACTGAATT TAGCATCGAA CGATAAAGGA 
109301 ATGATCTGGG AGATCTCAGA GAGCACCCAG GGAGCTCGGG TTGCTTTTCC

109351 CCAGAGGAAA TTGGCGATGT CGAAGAGCCC TTCTGTATGG TGGAAATACA

109401 AAGAGGCACC GTAAGTATCT CCGTGGTTCT TACCTGTAAT ATGATTGCGA 

109451 TCTCTAGCAA AGAGCTGGCA GAAGGCAAAA GTAAGCTGAT CCTCGGCAGG 

109501 AGTTGTTGCT GTGATCCCTA GTGCATAACC CCCGCTGATA TGGCGGAAAC 

109551 CATGGCGGGT GGGCATAGAA TCTCTATAGA AGAAATTCGC AATTCCTGAA 

109601 AGCCATAGCT CACGCTCAAA AGGCTCCCCA CTGGACTTGG TTTCTATAAG 

109651 CTGATTGATC GAGCGTATAT CTATAAAGTT TCCCCATAAG CTATTTAGAG

109701 GGAGATTACT TTTTCTCTCA GGACTAGGAA TGTATCCTGT ACGGGTCCAG

109751 TTGATGCTTC CTATTTTTGA GGATGTTGCA TTTGCCCAAG ACAACTGCCA 

109801 GTTTCCTTGA TACCCGTAGT GGGTTTCAGG TTCTTGAAGA GTCAGGGTAG 

109851 AAAGAGCTCC CAGAGTAATC GTTCCGTTGG CTCCTGCGGT GGTAAGTTCA

109901 AGAAGAGGAT AGGTACTAGC ACTTTTTAAG TTATGATTCT CATAGAATGA 

109951 CCCTTCCGTG TCAATAAGCG CAATCGTTCC CGATAGGCTG ATATTTTTAT 

110001 CTGCAGCTTC TGTTTTTAAA GCTGCCTTGT TGGTTCCATC TAAAGAGGAG 

110051 AGATTTACTG CTAAGCCATT AAGCGAAAGA TTTGCCTCTT TAGCACTAAG 

110101 TGTAGTCCCC CCATCCATTA AGATGCGGGA TCCTGGACTT TGAGTCAGAT 

110151 CCTTGAAAGT TACGGTGACT CCATCACGAA GTACAAGATC TCCCCGCGCT 

110201 AATACTGCAG GTTGTCGGAT AGTAGAGGTG ACGTTTGCAG CGATTGCTTT 

110251 TTCTGTAGGG GAAAGCTTTT CTCCAGAAAA GACAATCGCA CCCCCATACT 

110301 CGATCTCACT GTTCGCATCT GCTAAGTTTA AGTTCAATGT GTCGGTAGAA

110351 GCTGCGGTTC CTGGATTTGT GATGGGATCA TAGAAATAGA TAGATTGCCC 

110401 CGTAGCAGCT CGTATCGATG TGACTTTAGC GGTATCAATG ATATTTATTG 

110451 CGTTTCTTGT ACTTGTGCTT CCGTTGGTGA CTTGGTTGTT ATTGAAGGTA 

110501 ATATCTCCAG AAGTAGCAGA GAGAGCGAGT TCCCCAGCAG ATGCTATATT 

110551 GATCGCTCCT CCTCCTCCCT GACCGGCGCT ACTTCCTGAG ATATTACTTT 

110601 GAAATAGAGT AGGACCTCCA GCGGAAATAC TGACCTTGAG TCCAGAGATG 
110651 GCTCCGCCAT ATGTCAATGC TGTATTATTT GTGAAAGAGA GGTTTTTGTT

110701 CCCAGTAAGA GTCACTGTTT TATCTGTCGT AGTGCAACAA ATAGCCCCGC

110751 CCTGAGCTTG AGCGGCTTCC CAAGCACTAT TGCCGTCAAA GATCACTTGA 

110801 AAGTTATCTG TAATCGAACA GTTGTCAGTG CTGTACAGAG CACCGCCAGA 

110851 TCCTTTCGCT AGGTTTTGAG AGAAGGAAAC TATCCCAGGG CTGTTCTCGA 

110901 TAGTTATAGT TCCTGTAGCG TAAACTACAC CGCCTTGCTT CCCTGTGAAG 

110951 GCTTGGTTTC TCGAAAAGCT CGCAAACTGA GATGTCCCTG ATAATAAGAA 

111001 GTTTTTCGTA TTGATAACAC CGCCGTTATC TGACGAGAAG TTCTGAGTAA 

111051 ATATAATTTG GGAATTGCCA GTTAGAGATA GATTCCCCAC AGATTTTAAA 

111101 GCACATTGTC CAGTAGGAGA GAGAAGAAGA GAGGGACAAG AGATAATAGA 

111151 GAGTCTAGAA AAATCATTAA AGAGAAGATT CTTATCTGCT GCTGAGGTAC 

111201 TGGCTACAGT TCCAGCGCTA GAGCCCGCAT TGATAAATGC AAACTTCAGT 

111251 GCATGTTGAT TTCCTTGGAA AGTAAGATCG CCGCCCGCTT CTAGGAAGCA 

111301 TCCTGAGGCT AAGGGAATTC CTAAAGCCCC TGCATTTTGA AAGGATACGT 

111351 CGGAAAGTAA GGAATAGGTA GTTCCTGCAG CAGCGTCCGT AGTGGAAAAG 

111401 ACCGTGAAGG TAGTTCCGTT AGATCCATCA TAGCTATTAT TGCTGCTATC 

111451 TAAGGTCACC TCTGCCGCGA CTATAGAGAG CGATGAAAAG AGCGGGATTG

111501 AAGAAAAGAA CAACCAAGAG ACAGAGGACT TCATTTGTAA GCACTTTTTT 

111551 GAAACAAGGA AATTAAATTA GCAAATACTG TAAAGAAAAA AAGAAATCAA 

111601 GGGAAACGCA AGGAATTGAT TGATGCGGAG AATCAGAACC CCAAGGATGG 

111651 CGGATCTTTT ACTTCTCTTC ATACGGATCC TAAGAATCTC TTTGATGAAG 

111701 AGGGGATGCC CTCCCCCTCT GATACCCTAC AGTGCGATCT CAATAACGTA 

111751 TTCATCTTTA TAAAAAGTAT GTTTTTCTAA GATTCTCGGA GAATCTTAGA

111801 AAGAATAACG AGTTCCACAG TTTGCATTAT AGCTTCTTGA GGAGCTGCGC 

111851 AGTTCACAAC TTCCAGAAGC GAAGCAGTCA AGACCATGAA GTAACTTCAG 

111901 ATGTCCAGAA GCCTCAGCAA AGAAAGCTTG TCGTGATAAG TTTGTAGCAA 

111951 ACGTAGACCA CGAGGTGCCA TTTGTTAAGG AGGTCAGGCA GTGAGGGTGA 
112001 TCCCGGTAAG CATCTACAGC GTAACCTAAA GTAAGAAGCA AAGCACTGGG
112051 GGGCTTTGCT GATTCGTGTT TGAAGGTGAG TCCCATAGGG ATAGACACGT
112101 TGACCAGATG GCTAGCGTCA AAGATACGTG GATCAGCAGC AACCTCTTGG
112151 AATCCTTTTT GATTTACACT CACAACTTGG AGTTTCACAT AGGGAGAGTA
112201 GCTGGTAAGG TATCTGTAGT TTAGATCTAC AGGAAGAGAA CCACCGACTT
112251 CAACAGCGAA GCTATGGCTG TCCCAGTCTG ATTTCCCTTG TGTGTTGTTC
112301 GCAAGCTTTG TCGTCATATT ATGGTGGTTT CTTCCATAGG AAACTTGACC
112351 ATGGAGAACA AGGGGAGTTT CTCCTGGGAG CTCTGGAAGG ACCTTAGAGA
112401 GGACGTGGCG ACGTAATGAG CTATGCAGGG GAATGACATA AGAGCTCTGA
112451 GCACAGAGAG ATCCTGCATA GACTTGAGAT TTAATATCCG AGACTACGTA
112501 ATCCTTAGAT TTGCCAAAGA GTTGGCTGAA TGCAACAGCA AAGGTATATT
112551 CTTGAGGGGT GGTCATGCTG CCACCAACAA TATAACCTCT GGAAATCAAA
112601 CGGAATCCTG CATTTTCCTT TTGCTTGTCT TGATGGAAGG CGTTGCCAAT
112651 ACCTCCAATC CAAATCCCTG GATGTGAGGG AGCGTCCGAC ATCGCAGTGG
112701 CGATCTCCTG CTGTATAGAA TGGATGTTTA CATAAGCATT CCAAAGGCTA
112751 TTAGGAACTA AAGTCGCACG AAGCTCTGGT TTAGGAGTGT ATCCTAACGC
112801 TTGCCATTCC GCGACCAAAG TCACCTTCCC TCCAGCTCCT ACTTTAGGAA
112851 CCAGAGTCCA ACTCCCTTGA TACCCATAAT CCGGAGCAGC CATGCTAGAA
112901 GGAATCGGAT TGAAGTCGTC TAAATTTACA GTTCCTGAAG TAGAAGAAAG
112951 ATCTAAGAAA GGAAGATTTA AGTTTGCTTT CAACCCAGGA TTGTCATAGA
113001 AACTTCCTTC ATTGTTATGG AATTTCAGAT CCCCTGAGAT TTTTAATCCC
113051 CCACTTGTGC TGTTTACGGC AATCGTTATC ATACGCTTGC CATCTAAAGC
113101 ATCCAGATTT ACAGAGAGAT TCTTTAGATC GATGCTGCCA TCTGTATTGT
113151 TAGTTGTCGT GGTCTCTAAG GTCGTTCCTG CATCCATGAA TACTGTAGAA
113201 TCAGGCTGCT GTGTGAAGGA ATATACTTGT AGGGTGGCTC CTTCTTTTAA
113251 AACGACATTT CCTCCTGCTA AGTTGATCTT CTGGTTCAGT ATGGTGGTAG
113301 TATTTGCAGG AATCGAGGCA TCTTGACTGG GGAGTTTTCC AGAAGAAAAT
113351 ACTATAGTTC CCGTGTTTGG GTTTGCAGGT GCTACAGGGA CTACAGGCAC
113401 TGAAGCTATA GGACCATTTT TTGGTTGGGG AGGAGGAACA ATAGCTTTGA 
113451 CAACAGGATT GATGACTAAC TCCTCTATTG TTCCTCCAGA TGCAGGAGCT 
113501 TCCATCGTAA TAGGATCATA AAAATAAATC GTATGACCAG GAGCTGCTGC 
113551 AAGCTTAGTG ATCTTAGCCC CTGCACCTAA ATGGATCGAG TTGGGAGTTG 
113601 AAGTTCCCTC AGTCGCTCGG TTCCCTGAGA AAGTAATATC CCCATCAATA 
113651 GCCTCTAAGG AAAGTTCTCC GCTATCGGCT ATATAAATGG CGCCTCCCTT 
113701 GCCTCCAGAA TTATTGGTAA AGGAGACAGG ACCGTTAGCT GTAATCGAAA 
113751 GGTTTTTCGA ATAAATCGCT CCTCCCGAAG TTTCAGCAGT ATTGCCATCA 
113801 AAGTTTATGG ATTCACTGCC TGAGATTACA CACTTAGGAG CATAAATACC
113851 ACCACCACTT CTTTTTGCCG TATTGTTAAT GAAACTTAAA CTCTCATTTT 
113901 CAGTAAGAGT TAAGCTTTTT GTAGCTATGT CAGACTCTGA GATATTACAG 
113951 AGGATCGCTC CACCACAACC TTCTTGATCT GTAGTTGTTG TTGCTGTTGC 
114001 TGTTGCTGAA TTTCCAGAAA ATACAAGAGC CTTATTTTTG GTAAAGGAAG
114051 TATTTCCTTT AGTATGTAGA GCCCCTGCTG TCTTTGCTGT ATTTGTGCTG 
114101 AAGGTCACGG TTCCCGTACT TCCCGTAAGA GTAAAATCTT CGGTTTCTGT 
114151 ATAGATCGCT CCTCCTGCAG TTTCGGCAGT ATTGCCATCA AAGGTAAGAG 
114201 TCGTGTTTCC ATGCAGAGCA CACTTGGTCG CATAGATCGC ACCGCCACTT 
114251 ACTGTTGCAG TATTACCAGA GAGACTCACG TTTTCGTTAT CTTCAATCCA 
114301 GAGTCCTTTT TTAGTACTTA CAGATGCTGA CTCAAGAAAC GATAGGATTG
114351 CCCCACCGCA ACCCTCTTGA TTTGCTGAAG AATTACTCGCS GCCCGTAGCT
114401 l-TGTTCCCTG AAAAGAGCAG GTTGGTATTA CCAGACAGAG AGTTGTTGCC 
114451 TTTAGAATAT AAGGCGCCGC CTGTCTTTGC TGTATTTGTG CTGAAGGTCA 
114501 CGGTTCCTGT ACTTCCTGTA AGAGTAAAAT CTTCAGTTTC TGTATAGATC 
114551 GCCCCTCCTG AAGTTCCAGC AGTATTGCCG TCAAAGGTCA GGGAGCCGTT 
114601 TCCAGTTAGA GTACATTTGG TAGCATAGAT CGCACCACCA CTTACTGTTG 
114651 CAGCATTACT AGTGAGGCTG ACTTCTTGGT TGTTTGCAAT CGATAGTCCT 
114701 GTTTTATCGC TTACGGATCC TGAATCAATA AAGGCTAGGA TTGCCCCACC

114751 GCAACCCTCT TGATTTGCTG AAGAATTACT CGGGCCCGTA GCTTTGTTCC 

114801 CTGAAAAGAG CAGGTTGGTA TTTCCAGTCA GCGAGCTGTT TCCTTTAGAA 

114851 TATAAGGCGC CGCCTGTCTT TGCTGTATTT GTGCTGAAGG TCACGGTTCC 

114901 CGTACTTCCC TTAAGAGAAA AATCTTCAGT TTCTGTATAG ATAGCTCCGC 

114951 CACATCCTGC TGTCGCAGTA TTCTGATCGA AGGTAAGAGT TGTGTTTCCA 

115001 TCCAGAGTAC ATTTAGTAGC GTAGATCGCT CCACCATTCG CAGTTGTTGT 

115051 ATTACTAGTG AAGCTCATTT CTTGATTCTG AGAAATGGCT AATCCAGTTT 

115101 TGTCTGTTGC TGTAGCAAGA TAACAACAGA TTGCCCCACC ACAACCTTCC 

115151 GGGTTATTTG CCTGTGCTGC TGAGCCGGTT GTTTTATTTT CCTGAAAAAG 

115201 TACTTGAGTG TTGCCGGTAA GAGCAAGATT GTCATCAGAG CTCCAAGCAC 

115251 CCCCCGTCTT TGCAGTATTA GATTTGAAGG TAACGACTCC TGTATTGGCA 

115301 TCTAGCGTGC TATCCTTTTC TTTTGAGTAG ATCCCCCCAC CTTTATCTGT 

115351 AGCAGTATTT GAGGAGAAGG TCACCGTTCC TGAGTTTCCT TGGACTGTAG

115401 TGTTTGCTGT ACTACAGAGG GCCCCGCCAT TTTTTGTGCT AGTATTTTGA 

115451 TCTAAGAGAG CTGCTGTCGT AGTCTTAGCA AGATCGATGC TGTAGGCAGA 

115501 AACTGCAGCT CCATCTTTTT CTGAAGTATT TTTTTGGAGG GTGACACTGG 

115551 CATTGTCAGT AAAAGTCGCA GTACCTCCCT CTGTATTTGT CACACAAATA 

115601 GCACCCTTGC CGCCCGAAGT TCCTGTTGCT GGAGCTGAGT CGATTAAGAG 

115651 TGACGAGAAT CCTGAGAAAG AAAGAGCTGT GTTGGTATTG TTAATTGCAG 

115701 CACCATCATG CGTAAGCGCT ATGGTTTGCA GAACCAATGA GTGATCAGCT 

115751 CCAACAAAAC TCAATGCTCC TCCTGTGTTT GTAAAACAGC TTTTATCTGC 

115801 AGGAGTAATT GCAGATACAT TCGTAATAGA AACATCGCTA GTGAGAGTGT

115851 AGGTAGTTCC TGAAGCATCC GAAGTTTCCT TGGCAGTGAA TGCTGCGCTA 

115901 CCACTACTAC CATTTTCATA GTTATCGGAT GATGAGAGAT CCGTGTTAGC 

115951 AGCCATTAGT GGATGTAGGG AGAAAACTAA AGCCGAAGAG GTAAGTAGCC 

116001 AAGGTAAAGA ATATTTCATG TGTCTTTGGG GAAAAGCTTT TTATCAAAAA 
116051 TACTCCCATA GCATGTGGCT TTAGGAGCAT GGTGCACCAA TAGAGAATAC
116101 AGTTAAATAA ATCAAGTAAA TGCTCTGGAG AAGACTCTCA GTTATAGAAG
116151 TTTCAATCTT GGGAGAGAAG CATTTAAGGT ATTTTTCTAT ATTTAAGAGT
116201 CCCTTAAAAC ATAAGGGAAA TGCTTAAGGG TAGGGGAGAA GGTGTACAAG
116251 CGGTTTTGCT TTTAGACCTT CTTGAATTTT AGAAGGAGAG AGTAAGGAAG
116301 ATGGTTATCT AAACCACGGA TCCTTATTTG TTCTTTCTGT GTTTCCGTTT
116351 TTCTCTTTCG TCATAACCGT CAGAGAATGG ATTGGAGGGG CTGAGGTTAG
116401 CCGTGCTTTT TCCGTCTGGA GGCGGAGTTT TAAGAGATTG ATTCGATTTC
116451 CCGCTTTTCT TCTTAGATTG CTGTTTCTGT TCTTCATCTT GATTTTGCTG
116501 TTGCTGCTGT TTATCTTGGC GATCACGAGG GGAACGGCGG GAAGATGAAG
116551 AAACTCGAGG AATTGAGGGT TCTTTTCCTC CCTTAGGGTA GACCGGCTCA
116601 GGGTGTAGAA CCGTCCCCGT GGAGAAATTA GGAGAGCGGG AACTTGCAGG
116651 CATTATGGGT GTAAACGCAC TGCTCGCTCC ACTACCGAAT GAACTGCTTC
116701 TTAAATCTTT GAAGTGATAA GGCTGAAAAT CGTCCTTAAA TGGAGGACTT
116751 ATAGATGCGG GTCGTGTAGA AGCCGCATCT GAGGGTTTTG TTTTAAAACG
116801 TGAGAGGCTG CCGAAGGAGC GTCTATGCTC AGGATTTCGT GACGAATAGA
116851 AGAAAGATCC TTGAGCCTCC CGACCGATCC TACGATGACG GGCATCTTGG
116901 CGTTGTGAAG CTTCGTGTTC CTCCATGCGG GCAGATGCAG AAAGATCTTT
116951 TTTCTTTTCC ACTGCGATTT TTTTTGCATC GTCGGCGGAT GGAGTCACTA
117001 GAATAGGCTC CGCGGTCTCT TGGGTTTTCT TTTCCATCCT CAAGAGGTGT
117051 TCCCTACGGA AATAATCTAA GGTGAGCTTA TTCAAGGACA TGAAGTATAG
117101 CCCCACCGTA ATCAATCCCA TGACAAACAT AGGGTTGGCA AAGACTAACA
117151 TCGTCCCACT AGAGGCAACG AAAGCCCCTG CAATAAGACC CGCAGCAATC
117201 GCTAATACAA TGATAGGAAC GACGATAGCA GTGATTGCCT CACCGATTTT
117251 CTTGGCCTTC GGACTGTCCA GAATATCAGA AATAAGTAGA GTAACTCCCA
117301 AAGCTCCCAG GGCAAGTGCC GGAGCGAGAG CATAAAGCAT AAGGCTGTTT
117351 CCTGAAGCAG TCAGGAGAAT CGAAAGGATC GCAATTGCCG CAAGGGCAAT
117401 AATGCCTGTG TCATAGACAT AGCGCATTTT AGGATATTTA TCAGGAATAG
117451 TAATAAATCT TTTGAATAAC CCTGTGGATG ACGTTTTTAA ACGTTTTACC
117501 ACGCTTGGCT GCCCTGATGG GGATGCTGCT GCAGGAACTT GAGGTTGACC
117551 TAAAGGGTTT ATAGGGGGTT GGCTCATACT ATCAACTTAC TGTAATTATC
117601 ATTAGGCCCA TGAATTTTCA TTCATAGGAT ATATTTCATA CTATTATAAG
117651 ATTTAATAGG ATTTAGTTAG TTCTCTTTTC TTCTGAGTCT TAACTTTTTT
117701 ATTAAATAAA GTTTATTTGT TAAAATCTTA ACAGATTTTT AACTAAAACT
117751 TTAAGTTATT TTTATTTGGA ACTTTTAGTC GAAATAAGAC TCGCTTATGA
117801 GAGGGACATA CTCATCAGCA AATGGAGGGG GCGTGTGAGG TCGTGAGGGT
117851 GGAGGCTCAT ACGGGGGGAA AAGATTTTTA TGCACTTGGA AGAGGGTAAC
117901 GCTAGTAGTT ACAGACATGA GAAGCCCCAA GGAGATAAAG ACAACTGGAG
117951 GGGGGATAAA GACCAGACTA AAGACCAGAC CGATAATCAC AGCAATCGTA
118001 AGAATGTGGA GGAGCCAAGC CATAGCATAC GAAGCAAGAT GTTGGCAGCT
118051 TTTTGTTTTT ATAGCGTTAA AGAGAGCCCG TAGGGGTAGG GTCAAGGCAG
118101 CATATGTAGA AGCGCAAGGG ATCAGGATAA CCTTAAGAGC TCCTAAGACC 
118151 GAGGATACCA GTACTTCTAT GGTAAGAGCA GTTCTGGGAT AGCTCTTCGC 
118201 ACAGTTTTTT AATTTTCGAG CTAGTATTCG TTCCGGAAAA ATGCCATTCA 
118251 GGTATAGCTG AGAGCCTTGT TTGCAGATAT TTTTGAATCC CATATCCGTT 
118301 TGAAAGAGAA TATTTTATGA AAAATTATGT AAAAATTCTA AGAGGATAGT
118351 GGTTTTTAGA CAATCGAAAT TCCTGAAAAG GCAGGAAAAT GAGAGACACA 
118401 AGTAGACAAA ATCTCCTGAA GTTTTTGTAT GGGCCTGTAA AAAAATCTTT 
118451 CTGGAAACTG GAAATTAGAA GTTCATTACA GCGGAACCAC CGAAAAATGC 
118501 GGAGGTTTTA AAAGTATGGG ACGTTTTATT ATTGTTGTGA TCATCAGCTA 
118551 TCGAGATATC ATTCCCAATA GACCAACCGT AAAATCCCTT AATAAAACTT 
118601 CCCGGCCAAG GGGTGAGCTT CACGTTAGCT TCQATTTCAC GTCCTTGATA 
118651 TTCAAAGATG CCGCGAGAAG AAATTAGGTG ATTTTTGTGA AGTTTTTTTC 
118701 GGAATCTTGT AAGGCGGTAA GCAAGCCCTA AGTTACAGAC AGATGTCGAG 
118751 CGGTAATCAA TAGAAAAATT CACAGGATAG ATGCAATTGA GAGTTAGTTG
118801 GTCGGTAGCC TTGTAACTAA CACCTACTAA AGGCCAAGCC TTCTCTTGAT
118851 GGAGGCCTGT TTCATTAATG ACGCCAAAAA TAGCAGAAAG CTTCTCAGTG
118901 GCCTGGTATT TTCCAGAAAG AACTCCTTGA TAGAGTCCAT AACCCATCTC
118951 AATATTTTTA GGATCCACAA GCCCAGAAAG AATGATAGAC CACTGCCAAT
119001 TTTTTAAGGA GAGTGTATAA GCTCCTAAAG AGAGGAGAAC ATAGTTATAA
119051 AAAGAAGTAT CTTGGAAAGT CGCCCACCCA AGTCCATTAG GATCTGTCTC
119101 CGAAATAGGA AGTGAGCTTT TCCATTGAAT ATCCGCACCT ATATAGCCAG
119151 TAGAAAACAG TAGCCCAGAA TGCTCTGTAA TCGGAAGTGT GCAGAGAAAC
119201 GTTCCATCGT ATTGACGATA GCCTATAGTT TGATGAGGCA GCTTTTTAAA
119251 TTTAGCATCG TTCACCTTTA GGTATTGTAC CTGAGCAGAG AAAGGACGTG
119301 GAGGAGGATT TTTACATGCT TCTTCATCAA TTCCACAAGC ATCTTGAACA
119351 ATAAAAATAG GAGTCGAGAG TACGTGTCCC GCAAATGCAG CGATGTGGAA
119401 GAGCAGTTTG AACATGTTCT GTAAGATTCT CCAACGTTAC TAGAGATTGA
119451 AATGGAATAT ACGTAATTTT TAAATTACTA TCTAATAATT TTCCTACTCA
119501 GGAGGGACTT TCAAGAAGAT TTCCCTAGAT TTAGGGGGAT GAAAGGACTA
119551 GAATTTTTCT AATCAGCAAT CGAACAAAAT TACAGGGTTC TTGGAGAGGT
119601 GCTTAATACT ACGCATAAAT GATTTCTGAG ATGATTTCCG TTTCGGCTGT
119651 TTCTGTGCTA ACTTGTTTTG GCACTCAGTA ATTTTAAGCT CAAGATTTTG
119701 GATCGATTCC TTTTTCGCTT CAATTTCATT TAAAATTCTT GTTTTTTCTT
119751 GATTTTGTGT TGTCAGCTCT GGAAGATGCA CGCTTTTAAT TGTAGAGAAC
119801 ATCAACCCGT ATCTCTGAAC AAATCCTGAT CCTATAGAGG AAATGTCTTG
119851 AGATATCTCA TCAATCAGTT TTGTTCCTGT CTGCTTATTT TTAAGAGGGA
119901 TTGCTAGAAG AAGAGCAAGA ATCAGTGTGA GTACGATAAT TCCCAAGCCG
119951 ATGCCACAGA TGATCCAGTT TCCAGTATAT CCTGCGTAAC CTGCTGAGAT
120001 CGTTCCCCCT AAGGTTAACA GTAGTGTTAA GGCAAAACAG AGCGTATGCT
120051 TTGTGGAAAG TTTGGAGCGA AAGGTCTCCC AAGGAGCCGC TGGAATCGGA
120101 TGAGGCTGCG TAATATAAGC ATTAGGAAGA TTGAGAACTT CTGTAGCATG

120151 AGAGAGAGGG CTGCTCTCGG GGACTGGTGA TGGGGCTACG GAAGTTGCCA 

120201 TGTTGTTTCC TCGAATCGTT GCTAACTAGT TTTGATTTGT CTTTTCATTC 

120251 TTGTGAGAAA GCTCAAAACG TTTTTGATAA AGAGTTGATT GCAGTTTGAA 

120301 CTCTTTTTGT GAGAGGTCGC AGAGTTCTTC GTATAACTTT AGCGTATCTT 

120351 GAGTGGTTTT TCTCTGTGCA TTGGTTACAA GCTGTAGAGA GGATTCAGGT 

120401 CGAATCTTCT CCCTGATAAA TTGTACTGTC GTATGGAGCT GCTGAGGGAG 

120451 CTGCCGGATG AATTTAATAA ATCCTACCAA GGCTTGTAGG CAGAGAAGAG

120501 TCAAAATAGT AAGAACAATG CChACGGCAA TCAACAGAAT GCTTTGGCTA 

120551 TAGCAACCCA AACAGATAAT AGCGATTCCA GCAAGAGCTA CAATCACTAA 

120601 GATCGTAATC GCAGCAATAT GCATAGGAAT GGAGTGGGTG AGAAAAACAT 

120651 TGCCCTTCTC CCCCAAAGCT ACGATCTCCT TATTCGTAAT GAATAAATCA

120701 GCAGACTCTT CCGGAAGGGA TGAGGGAAAT ACCCCGTTTA AAGTACTAGA

120751 CACAAAGAGA ACTCTATTAT TTGAGGAAAT AATTTAAGAA AAATGGTATT

120801 TTTAGTCAAT TAGTAAGCGA GTCATGCCTC TTAGTTATTC AAATTTTTAA 

120851 AACCTTACCC TTCCTATGAG GAGACAAGTA AGAGAAATTA TGCAACAAAC 

120901 TGTAATTGTA GCAATGTCAG GAGGCGTGGA TTCTTCTGTC GTTGCCTATT 

120951 TATTCAAAAA ATTTACCAAT TATAAGGTTA TTGGCATCTT CATGAAGAAT

121001 TGGGAAGAGG ATCGCGACGG CGGTCTCAGC TCGACTACTA AAGATTATGA 

121051 TGATGTCGAG AGGGTCTGTC TTCAGCTCGA TATACAGTAT TACACCGTAT 

121101 CTTTTGCTAA AGAATATAGA GAAAGAGTGT TCGCTCGTTT CCTCAAGGAA 

121151 TACTCTTTAG GCTACACTCC TAACCCCGAC ATTCTTTGTA ACCGAGAAAT 

121201 CAAATTTGAC CTTCTACAAA AGAAAGTCCA GGAACTTGGC GGAGATTACC

121251 TCGCTACAGG GCACTACTGC CGATTAAATA CCGAGCTCCA AGAAACCCAA 

121301 CTCCTTAGAG GTTGCGATCC TCAAAAAGAT CAGAGCTATT TTTTATCAGG

121351 AACTCCTAAA AGTGCTCTTC ACAATGTGCT CTTTCCTCTT GGGGAAATGA 

121401 ATAAGACTGA AGTTCGTGCG ATTGCAGCTC AAGCAGCTCT TCCCACAGCA 
121451 GAAAAAAAAG ATAGTACAGG CATTTGCTTT ATAGGGAAGC GCCCTTTTAA

121501 AGAGTTCCTA GAGAAGTTTC TTCCCAATAA AACAGGCAAC GTTATCGATT 

121551 GGGATACCAA GGAAATTGTA GGGCAACATC AGGGAGCTCA CTATTATACT 

121601 ATAGGGCAGC GGCGAGGACT TGATCTTGGA GGATCCGAGA AACCCTGTTA 

121651 TGTTGTGGGA AAAAATATAG AGGAAAATAG CATTTATATT GTGAGGGGGG 

121701 AAGACCATCC CCAGCTCTAC CTACGGGAAT TAACAGCTAG AGAGCTCAAT

121751 TGGTTTACCC CTCCTAAATC CGGATGTCAC TGTAGCGCTA AAGTCCGCTA

121801 CCGTTCTCCT GATGAAGCTT GCACGATAGA TTATAGCTCA GGTGACGAGG

121851 TCAAGGTGCG ATTTTCACAA CCCGTCAAGG CGGTAACTCC AGGACAAACA 

121901 ATAGCGTTTT ATCAAGGAGA TACCTGCCTT GGTAGTGGAG TTATCGACGT 

121951 TCCTATGATT CCAAGTGAGG GCTAGGGAGA GCAGCTTCCT GCTCCTCTTC 

122001 TTCCCTTTCA AAGGCAACGC GATTTTCAAC CAAGGTTGCT CGTAGCTTGC 

122051 GAGCTTCTTG ACGGCAGGAC TCTTTAAGCA AGAGCTCCGC TAGAGGATCT 

122101 TCAAGGTACT GCTCAATGAC ACGGCGTAGA GGACGTGCTC CCATTTCTGG 

122151 AGAATGCCCC TTCGTTACTA GGAAGGAAAT CACAGAGTCT GGGATGTTCA 

122201 AAGCCATTTG GTAGTTTTTC AGTCTCGAGT CCAGTTTGTT GATCTCTAAA 

122251 TGGATGATCT CCGATAGAGA TTCTTTCTCG AGGGGACGGA AAATCACACT 

122301 TTCATCCAAA CGGTTAATGA ACTCAGGCTT TAAGTGTTTC TTCATAGCAT
122351 GTTCGATTTT CTCTTGGATG ACCTTATAGT CCATATGGGA CTTCAAGCCA 

122401 AAACCAATTT CTCCGCTTTT ACGAATGAGA TCAGCTCCCA AATTGGAGGT
122451 CATGATAATA ATGGCATGAC GGAAATCCAC TTTGCGACCA AAAGAATCAG 
122501 TAAGACGTCC TTGCTCTAAA ATTTGCAACA TCAGGTCCAT AATGTCTGGG 
122551 TGTGCCTTTT CTATCTCATC AAAGAGAACA ACGCAGTAAG GACGGCGACG 
122601 TACCTGTTCC GTAAGGTGGC CCCCTTCTTC ATGACCTACA TATCCTGGAG 
122651 GTGATCCCAT CATCTTGGTA GCAGCAAATT TCTCCATGTA CTCTGACATG 
122701 TCTACCTGAA TCAGAGCGTC TTCACCACCG AACATCTCTA TAGCAATTTG 
122751 TTGGGCGAGC AGGCTTTTCC CTACACCGGT AGGCCCAAGG AATAGGAAGG
122801 AGCCCGTAGG TCGGTTAGGA TCTTTGATCC CTGTTCGAGA ACGTCGGATG
122851 GCACGGCAAA TGCTGGTAAC GGCATCATTT TGACCAATGA CTTTTCTTCT
122901 TAACGTGTCT TCTAACTTCA GAAGCTTCTC ACTTTCAGCT TCTGTGAGCC
122951 TTGCTGAGGG AATTCCTGTT TGTAGAGAAA CTACCTGAGC GACTGCTTCT
123001 TCATCTACAG GAACTTGGTG CTCTTCTTTA TGATTTTCCC ATTCCTGTTT
123051 CATACTTTGC AGACGTTCGC GAAGTTTTTT CTCTTCATCA CGTAAACCTG
123101 CAGCTTTTTC GTATTCTTGA GTTCCAATGG CCTGCTCTTT GGCCAATTTT
123151 GTATTTTCGA TTTCAGCCTC TAGCTTCATT AAATCTGTAG GCTGACCCAT
123201 TGTATTCACA CGGACACGAG CCCCAGCTTC ATCTAAAAGA TCTATTGCTT
123251 TATCAGGGAG GAAACGTCCA TGAACATATT GATCAGAAAG AGTCGCAGCT
123301 GCTTTTAAAG CTTCTTCAGT AATGAAGACA TTGTGATGTT CTTCATACTT
123351 TTTCTTGAGG CCACGTAAAA TCTCAATAGT CTCATCTACA CTAGGAGGGT
123401 GAACCACGAT TTTTTGGAAA CGACGTTCTA AAGCTGCGTC TTTTTCTATG
123451 TGCTTGCGAT ACTCATCTAT CGTAGTTGCT CCAATACACT GAATTTCACC
123501 TCGCGCTAAC GCAGGTTTTA AAATGTTTGA AGCATCGATA GCACCTTCAG
123551 CTGCTCCTGC TCCTACAATC GTGTGGAGCT CGTCAATGAA GAGCAAGATG
123601 TTTCCATGCT TGCGAACTTC ATCCATGACA GCTTTGATCC GTTCCTCAAA
123651 TTGCCCTCGA TATTTTGTTC CAGCAATCAT TAATGCTAGA TCTAGAGTAA
123701 TCAGTCGCTT TTTCCGTAAG GCATCAGGAA CCTCATTCAG AATGATTTTT
123751 TGAGCCAGAC CCTCAACAAT TGCAGTCTTA CCAACTCCAG CTTCTCCAAT
123801 AAGTACAGGA TTGTTTTTTC TTCTTCGGCA AAGAATCAAA ATCAACCGTT
123851 CGACTTCTGA AGAACGACCA ATGACAGGAT CGAGCTTAGA CTCTCGGACC
123901 ATCTCCGTTA AATCATAACC ATATGCTTTC AGAGCAGAAA GCTTTTCGTT
123951 TTTGTCAGAA CCTAAGCTAT GACCTAAAGG AGATTTTGAA GATGAAGGGT
124001 TGCTTCGAGA GGATGAGGAA GAAGACGACG ACGAAGGAGG AAGTTGTAGA
124051 TTGAAGGTCT CTAATTCTCT AAGAATTTCC TTACGAACCT CTCTTGGATC
124101 GATATGTAAG TTTTCTAATA CCTGAAGAGC GACACTATCT GATTGATGTA


124151 GGATCCCTAA GAGTAAATGC TCCGTCCCGA CATAATTGTG CTCTAAAAGG 

124201 CTGGCCTCTT CATTTGCTGA TTCAAAAGAT TTTTTTACTC TTCCTGTAAG 

124251 GGCAGGGTCT CCGTAGACTT GAATTTCTGG ACCATAACCA ATCAGGCGTT 

124301 CCACCTCTTG CCGTGCCGTA TCAAAATCTA TACCGAGGTT GCGTAATACA
124351 TTAACAGCTA CCCCTTGACC AAGTTTGAGA AGACCAAGCA GGATGTGCTC
124401 AGTACCCAGG TAGTTATGAT TTAAACGCTG AGCCTCCTTT TTCGCCAGTT 

124451 TAATGACTTG TTTTGCTCTA TTAGTGAACT TCTCAAACAT AAAAACCTAA
124501 AAGACAGGGG TAGAACTTTC CTTAAGCATA TACGAAATTT AAAATAATGA 
124551 TGCAACTCTT CGCTCTAAAC CAGCAAATTT GGTAAAATTC CTCTGAGTTT
124601 AAGGGAAAGT TATGCACAAA CCTTTTGTAT ATGATACAAT AGTTCAGCTT
124651 CTTTTGAAAC AGTCTTAATT AGTTTTATGT TTGTTATATG AAAGTTCGTA
124701 TCGTAGATTC AGGAAAATCT TCAGCGGCCT CCCACATGGC TAAGGACAGA
124751 GATTTATTAG AATCTCTGCA AGATGGGGAG CTCATTTTAC ACCTTTATGA 
124801 GTGGGAGAAT CCTTGTTCTC TGACGTACGG TCACTTTATG CGTCCAGAAA 
124851 AATTTTTACT TTCCAACTAT GCGGATCTAG GATTGGACGC CGCAGTGCGG
124901 CCTACGGGAG GGGGATTTGT CTTCCATAAG GGAGATTATG CTTTTTCTGT
124951 TCTTATGTCT GCGACACATC CTTCCTATTC TTCTTCGGTA CTTGAGAACT
125001 ACCATACTGT AAACTCTTTT GTAGCGAAGG TTCTAGAGAA AGTATTTCGG 
125051 ATCCAGGGAA TGTTAGCTCC AGAAGACGAA AACTCTTCTT CCAGAGATTC 
125101 AGGAAATTTT TGTATGGCAA AAACTTCGAA GTATGACGTT CTTTTTGGGG 
125151 ACAAGAAGAT AGGGGGCGCT GCCCAACGCA AGGTGCAACA GGGATTTTTA
125201 CATCAAGGAT CCTTATTCTT ATCGGGAAGT TCTTCTGAGT TTTACCAGAG 
125251 ATTTTTAAAA CCCGAGGTTC TTGAAGAAAT TATTGAACAA ATCCAGATTC 
125301 ACGCGTTTTT CCCTTTAGGT TTGGAAGCTG CTGATGAAGT GCTGCAGGAG 
125351 GCGCGTCAGC AAGTCAAAGA GGCGTTTATT AAATTGTTTT GTGGTGAGGG 
125401 GTTATGATGA GTCGGTTGCG TTTTCGCTTG GCAGCTCTTG GAATATTTTT 
125451 TATTTTGCTG GTTCCTAATT CTGTTTCAGC AAAGACAATC GTAGCTTCAG 
125501 ACAAGGAGAA GGTTGGAGTT CTTGTTTATG ACAATAGTGT AGAGGCCTTT

125551 CAACAGATAT TGGATTGCAT AGATCATGCA AATTTTTATG TAGAACTGTG 

125601 TCCCTGCATG ACAGGAGGCC GAACGCTTAA AGAGATGGTA GATCACCTCG

125651 AGGCTCGTAT GGATCTGGTT CCAGAGCTCT GTAGCTATAT CATTATCCAA 

125701 CCCACGTTTA CCGATGCTGA AGACCAAAAA TTACTCAAAG CTCTCAAAGA

125751 ACGTCATCCC AACCGGTTTT TCTACGTTTT TACAGGGTGC CCACCCTCAA 

125801 CAAGCATCCT CGCTCCTAAT GTCATTGAAA TGCATATCAA ACTTTCTATC 

125851 ATCGATGGGA AATATTGTAT TTTAGGTGGT ACCAATTTTG AAGAGTTTAT 

125901 GTGCACTCCA GGGGATGAGG TTCCTGAGAA AGTGGATAAC CCACGTTTAT 

125951 TTGTCAGTGG AGTGCGTCGG CCCCTAGCAT TTCGTGATCA GGATATCATG

126001 TTGCGTTCTA CAGCATTCGG TTTGCAGCTC AGAGAAGAAT ATCATAAGCA

126051 ATTTGCTATG TGGGACTACT ATGCACATCA TATGTGGTTC ATTGATAATC

126101 CTGAACAGTT TGCAGGCGCC TGTCCTCCAC TGACTTTAGA ACAAGCCGAG

126151 GAGACAGTAT TTCCTGGATT TGACAAACAT GAAGATCTTG TTCTTGTCGA 

126201 CTCTTCCAAG ATCAGGATAG TTTTAGGTGG TCCCCACGAT AAGCAACCCA

126251 ATCCTGTGAC TCAAGAATAT TTGAAACTTA TCCAGGGAGC TAGATCTTCT 

126301 GTGAAGCTTG CTCACATGTA TTTCATCCCT AAGGACGAGC TTTTAAATGC 

126351 TCTTGTCGAC GTTTCTCATA ATCACGGTGT TCATCTGAGT TTAATTACGA

126401 ACGGCTGTCA TGAATTAAGT CCTGCAATTA CAGGACCCTA TGCTTGGGGA

126451 AACCGTATTA ACTATTTCGC CTTGCTCTAT GGGAAACGGT ATCCTCTTTG

126501 GAAAAAATGG TTTTGCGAAA AGCTAAAACC TTATGAGCGG GTTTCTATTT

126551 ATGAGTTTGC TATTTGGGAA ACGCAGTTGC ACAAGAAGTG TATGATTATC 

126601 GATGATGAAA TTTTTGTGAT CGGAAGTTAT AATTTTGGAA AGAAAAGTGA 

126651 TGCCTTTGAT TACGAAAGTA TTGTAGTTAT CGAATCTCCA GAAGTCGCTG 

126701 CAAAAGCTAA CAAAGTCTTC AATAAAGATA TCGGATTGTC GATTCCTGTA

126751 AGTCATGGCG ACATTTTCTC TTGGTATTTC CATTCCGTAC ACCACACTTT

126801 GGGACATTTG CAGCTGACCT ATATGCCAGC CTAGCGTCCC TGGGTGCGAA 
126851 TCTACCAACA GGATCTCTTC TGCAGGCTCT GCAGGGATCC TGCCTGGTTT

126901 TTTTCTCTGC TATCGTTTAC ACTACGCTTT TATTGTTTGG GTAGAGGGTG 

126951 GACCTTGTTA TCGTTCTTCT ATAAGCATCA AAAAAAATTT ATCGGCATTG

127001 TCATTGCTGT AGTTTGTGTT TCTTGGTATT GGAGTGGGTT GGGGACGATT 

127051 CTCTAGAAAA GGTTCTGCAG AGTCCACCTC ACGTCGGACT GTTTTTACTA

127101 CCGCTTCAGG GAAGCGGTAT GTAGAGAAAG ATTTCATGGC TATGAAGAAG 

127151 TTCTTTGCTC ACGAAGCGTA TCCATTTACA GGGAACCCTA GAGCTTGGAA 

127201 TTTTATCAAT GAGGGGCTAC TTACTGATTA TTTTCTAACG ACAAGGGTGG 

127251 GAGAAAAACT CTTTTTAAAA GTGTACCATC CGGGAGAGAA AATTTTTAGT 

127301 AAGGAGAAAG CTTACCAGCC GTATCGTCGT TTTGACGCTC CTTTTATTTC
127351 CTCTGAAGAA GTTTGGAAAT CTTCAGCTCC CCAGCTTTTA GAGATCCTGA

127401 AGGTCTTTCA ACAAATCGAG AACCCCATAT CAAAAGAAGG ATTTCTTGCT
127451 AGAGCCAAGC TCTTTTTAGA AGAGAGAAGG TTCCCTCATT ATGTGCTTCG
127501 ACAAATGTTG GAGTACCGCA GGCAAATGTT TGCTCTTCCC CCAGATGAAG
127551 CCTTATCTCG CGGGAAAGAC TTGCGGTTAT TTGGCTACCA GACGATTCAA
127601 GACTGGTTTG GGGATGCCTA CCTTTCTGCT GCTGTTGAGC TCTTGATCCG
127651 CTTTATTGAC GAGCAGAAAA AAGTACTTCC CAGGCCCTCA AAACAAGAAG
127701 CTCGTGACGA CTTTTATGAT AAGGCGAAGC ATGCCTATAC TAAGATCAGT
127751 AAGAATAAGG AATTTTCCTT AGGATTTGAA GAATTTGTAA ACTCGTATTT
127801 TCAGTTTTTA GAGATCTCTG AGTCCGAATT TTTCAATATG TATCGAGACA
127851 TATTGTTGTG CAAAAGAGCT CTTCTCCTAT TGCAGGGAGG CGTTTCTTTT
127901 GACTTCCAAC CTCTAACTAC ATTTTTCGTT CAAGGAAAAG ATTCCATACA 
127951 AGTAGAGTTC TTTAGACTCC CTAAGGAGTA TAGCTTTAAA ACAAAACAAG 
128001 AGTTAAAAGC TTTCGAAGTC TATTTAAAGT TAGTGAGTTT ACCTAAATCG 
128051 GATAGTTTGG ATGTTCCTAA TGAGATCCTT CCTATAGCGA CCATAAAAGC 
128101 TAAAGAGCCT CGGTTAGTAG GCAGACGGTT TTCTATAGAC TATAAGAGAG 
128151 TCGCTTTGCA AGACTTAGCA GCTACTGTAC CTATGGTTGA AGTGCTGCAC 
128201 TGGCAACAAA ATTCTGAGCA CTTCCAGGAG ATTCTCCAGC AGTTTCCTGA

128251 CGTTGAGACG TGTCAGTCGT ATAAAGACTT CCAACATCTT AAGCCTGCGC 

128301 TGCGAGATAA AATTTCTCTT TTCACACGCA AGGAAATCTT AAGGGCCCGC

128351 CCTGAGAGAA TTCTGCAATC GCTACAGCAA GTTCCTAAGC AGAGCCAAGA

128401 AGTTCTCTTA TCTGCAGGGA AGAATAGTGC TCTACCAGGA ATATCCGACG 

128451 GTCAGCAATT AGCCAAAGTG TTGCTTGAAA ACGAGGTTTT AGATTTATAT 

128501 AGCCAGGATG CAGAGACCTA TTATACTATT ATTGTTAATA GTTCTTTTGA 

128551 AAAAGAAGAA GTGCTTCCTT ATCGTGAGGT TTTAAAGAGA GATTTGGCCT 

128601 CACAGTTACT TACTTCTCAT GGTCATCTTG TTGACATGGA GCGTCTAGAA 

128651 TCTGCGTTGC GTACACGGTA TCCAGGAGAA GAAGGCGCTA GCCTATGGCA 

128701 ACGACGTCTT TGGAAGGTAG TGGAAAACCA CAGATTGGGA AGGCATCTCG

128751 AGGGGTCTTT CTCTTGGAGC TTAGATCGCT CATTGAAGAC TTTTTCCCGA

128801 GGAGACAAGG AGCTGCCCCA AGAGTTTGAT AGGATTTTCT CTATGAAGGT

128851 AGGAGACTAT TCTTCTGTAT TCATGAGTCC TAACGAAGGG CCCTGTTATT 

128901 ATCAATGCCT CTCTCATTTA CTGTATGATC GTCCTGCTAG CGTGGATAAA 

128951 CTATTTTTAG CTAAAAGTCA GCTAGATGAA GAACTTTTAG GATCCTATAT 

129001 GGAACGCTTT ATAGAACAGG GAGTCGTAAG GTGATGTGGT ATTCTGATTA

129051 TCATGTTTGG ATTTTGCCCG TCCATGAGAG GGTGGTGCGC CTCGGGTTAA

129101 CAGAAAAAAT GCAGAAAAAT TTAGGAGCCA TTCTCCATGT GGATTTACCT 

129151 TCAGTAGGGA GTCTATGTAA AGAAGGTGAG GTTTTAGTCA TTCTGGAATC 

129201 TTCTAAATCT GCTATAGAGG TGTTAAGTCC TGTATCAGGA GAGGTTATCG 

129251 ATATCAACCT TGATTTAGTG GATAATCCTC AGAAGATTAA CGAAGCTCCA 

129301 GAAGGTGAGG GATGGTTGGC TGTAGTCCGA CTAGACCAGG ACTGGGATCC 

129351 TTCTAATCTT TCTTTGATGG ATGAAGAGTA AATTTTTTAT TAGATATACT 

129401 CATTTTTTTC AGAAGATAAG AGGTATTTTT TTAAGGCTAA AACATTTAAA 

129451 ATTTATGTCT AAGGTTTAAA AAATACATCA GAATTATTCT ATGGATCCAG 

129501 CTAGTCCGGT AGCCCCTCAT GTCCTACAAG ATCATGTGCA ACTATCTTCT 
129551 GAAGAATTGT CCGCATTATC TTCCGGGGTA TCTCGTGTGA AGAAGCTTAC
129601 TATAGCCATC ATGGTCCTTT CATTGATAGC GATTTCTTTG GTAGCCTGTG
129651 GCCTATTTTT AACGGGATCG GCACCTCTAC AGCTCTCGAT CTGGATTGCT
129701 GCGAGTTGCA TTACCTTATC TATGTTAGTT TGTGCGTGTT GGCGTTATAA
129751 GATTTCCAAT GCCTTAGAAA AAACTAAGGT AGCGCATGAA AGCTGAGTTG
129801 GACATTTTAT TGATTGAAAA ATCATGACTA CATTACCTAA GTACGTTCCC
129851 CGTTCTCGAC AAAATCCCGA TACTCTGACC TTCCTAAAAC GGTATTCTAG
129901 TGTCCTTCTC CATTCGGAGA ATTCTTTATC TTATCGGATT TTTGCGAAAG
129951 TGCTTGCTAT TCTCCTCACT TCGTTAGCTG TAGCTTTCGC CGTGACTTTG
130001 TTTTCTTGTG AAGGTTCTCA ACTGAGACTC TGCGCTCTCT ATATAGGTAT
130051 AGCTCTTGCT ATTTGTGTTT TACTGACGAT CGTTGTTTAT TGTATCGCAA
130101 GTAAAATCGC CACAGCTTGC AAAAAGCCGC CTTCCATATC TCGAATTGAA
130151 ATTGTTTAGA AGCATCTCTG TGTACAAAAG TTCACTAGAA ACTCGACTCT
130201 AGGAAAGTTC CTAGAGAGAG CGACGCTGCG TTCGTGCTTT AGAAATACTT
130251 GGCTGGTGTT TGGACGAAGA TTCTTTTAGT GGATTGGTGG TGTTTTCAAC
130301 AACTTCAGTC TCTAGAGGAG CTCTTTGAAT CTTTGCTGAA GGTTTAGAAA
130351 TATCAATACC TGTTAAGCTC ATAAAAGCCA TAGCAATGAG GCCTGTTGTA
130401 ATGAAGGAGA TCCCCATTCC CTGGAGGTTT TTGGGAATAT CAGAGTAGGC
130451 GAGTTTTTCT TTGATAGTGG CTAAAATAAC AATAGCGAGC CACCACCCAC
130501 ATCCCGCTCC TAAAGAGAAG ATCATCATAG GAATAAAAGG ATAACTACGT
130551 GTGATTCCGA AGAGCACACC CCCTAGGATC GCGCAGTTCA CAGCAATCAA
130601 GGGAAGGAAG ATCCCTAAGG AGAGATATAG ATTCCTGGAG ACCTTTTCTA
130651 AAAGAAGCTC TAAGATTTGC GTGAATGCCG CAATCACCAC GATGAAAATA
130701 ATCAGCTCCA GAAAACCTAG GTTTACAGAA GCTAAAGATG GAGAGATCCA
130751 AGTTAGAGCT TTAGGGCCCG TGATGAAAGC ATGGACAAAC CAGTTGATGC
130801 TCCCTGTTAC AGTGAGAACA AGGGCTACGG ACATCCCCAA GCCATTGGCT
130851 GTAGAAACCC TAGTAGAGCA AGCAAGGTAA CTACACATCC CCAAGAAATT
130901 CGCAAGAAGG ATATTCTGAA TAAAGGCTGC TTGTAGAAGA ATACCAAAGA

130951 CATTAAGCCA AGTATACGCA CCTAACCACA TAAACTACCT TTTTCTCTTT

131001 TTAGAGTCTC GAATGTTAAC AAGCCAAATC ATAATACCAA GTAGGAAAAA 

131051 AGCCGACGGT GCTAGCACCA TAAGACTTAA ATTTTGGTAT CCATCGGGGT 

131101 GGGTTTCGGA AGCATAAACA AATTGAGGGA TGATGCGAAA CCCCATAAGA 

131151 GTTCCAAAAC CAAAGAGTTC TCTGATGACT CCAATGACAA GTAAGACCCA 

131201 GCCGTATCCT AAGCCAGAGG CAAACCCATC TAAGAACGCT GGAATAGGAG 

131251 TCACATGCCT AGCTAGACTT TCAGACCTTC CCATCACGAT GCAATTGGTG 

131301 ATGATAAGAC CCACAAAAAC AGAAAGTGTT TTGGAAATAT CAAAGAAAAA

131351 AGCTTTTAAA AACTGGTCGA TAACAATCAC AAACAAGCTA ATGATAATTA
131401 GCTGAGTAAT CATTCTCACA CTGTCAGGAG TGAACTTACG TAATAAGGAA

131451 ACAAAGAAAG ACGAGCATCC TGTAACAATG CTGACAGCAA TTCCCATAGT
131501 AATTGCCGTT TGTACTGTTG TTGTCACTGC CAGAGCCGAG CAAATCCCCA
131551 AAATCGCAAT GAGAATTTGG TTGTTGCTCC ATAGAGGATC AAAGAAATAG 
131601 CTTTTATAGG ACTTTTTACT TGTCATTCGC CTGTTTTCTT TTCATGGGTT 
131651 AAATTAGAAA AATTTATAAG GAGCTGACGA TAGCAAGCCA GAGATTGTAC 
131701 ATAAGCTTCA GTGACACCGT TGCATGTTAA GGTGGCTCCA GAAATCCCAT 
131751 CAATAGCAGA AAGAGCTTTT GGAGAATCTC CCAAAGTAGT ACGCACGGAA 
131801 CCTTTAACTA CCTCAAGCCC TAGGTCTGTT GTTGCAAAAT TTGTAGTTCC 
131851 AGAAGAATCT TGTAGGAAGA TTTTCTTCCC ATAGAATTGC TCTTGCCATT 
131901 CGGGATTTGT AATATTTGCT CCTAAACCTG GAGTTTCTCC TTGTTGGTAC 
131951 CATGCGGTTC CCAATACAGT GTCACCGTCG TTTTTCACTC CTAGATAGCC 
132001 ATGGATGGGG CCCCAAAGGC CGAATCCTGA TATAGGGAAG ATCAAAGCTT
132051 GAACTGTAGA AAGGTCTTTC GCAACGTCGG CTCCTGACAT ATTTTCTGTG 
132101 CGAGAGGTAT TCTCTAAAAT GACATAAAAG GGGAGGGGGG ATTGCTGACA 
132151 CGGAGGGGTT TCTTGATATT TCTCAAAAAA TTCAATGGGA TTCAGATTTT 
132201 TTTCTTCAAA AGAAAATACC TTGCCTTGGG CATCTGTAAG TAGAGGACGG 
132251 ACAAAGCGCT CGGCATACAG CTCTAATTCA GGATAGGAAA CCTCAGAGAC

132301 TTTTTTTGTA GCAACTTCAA GAAGTTGTGT TTTTTTATCG AAAGTCGCAG 

132351 GCACCCACTC txxxTTTTCC TGAATTTGAA ATCTTCCTTT AAAATCTAAA 

132401 ATATGAGCAG CTAAAAGCAT TTGCTTATTG CGATCGAAAG TAGCAGCTTG 

132451 TTCCTGTATT GGGGAGAGCA CATAGTAGAT TGTGGATAAC AGCACTCCTG 

132501 CAAATAAGCT GAGGCCCAGG ATAAAGGAAA CGATGTACCA ggtttggttt 

132551 atgcggacgg tatgttttga agagccttta gacatattct agactcccct 

132601 TTTTCTATAC TTTCTAACAG CAAAATAGTC GATAAGAGGG GCAAATACAT 

132651 TGCCCAGAAG GATCGCTAAC ATCACTCCCT CAGGATACGC AGGATTGATA 

132701 AGACGAATCA CAATAGTCAT AAATCCTATA AAGAATCCGT AAATCCATTT

132751 CCCTAATTTC ATAGTCGGCG ATGATACGGG ATCCGTAGCC ATAAAGACTA 

132801 AACCAAAAGC AAGTCCTCCG AGGAAAAGCT GCCGATAGGC GGGAATGAAG 

132851 AATCGAGCAG GTGCCCAAGC TCCGTTTTGT CCCACGATGA GTACGCTGAT 

132901 AAACTTAAAG AGCCAGCCTG TGAGAAAGGC TCCTATCCCA AAGGCTGCCA 

132951 TGGTTCTCCA AGAGGCAATG CCTGTAACAA TAAGGAATAT TGCACCCAAC 

133001 AGACAGGCGA AAGTGGAGGT CTCCCCCAGA GAACCTATAA TGTTTCCCCA 

133051 AAAGAGATTC CCAGCTGAGA ACTTCCCAAT CCCATAGATC ACATCGGTAA

133101 TAGCATAGGC AGAATCGAAC TGTGTGGGAA GCAGCCCCAA TCCTCCCTCA 

133151 GCAACAGGAG CTGTAACAAA CGTTTGAAGT TGTGTAAGAG TGAGATTATC 

133201 TAAAACCCAA CCAGGATGCG TCTCTGTCCA AAGAGAAAAT TGTGAGTGAA

133251 TGACATCTTG AGTAGGGACG TGAGGAATGT GAAGCATATT TGCAGCAATC 

133301 GCATCGACAT GCAGACGCTT TACAGAGGGA GGTGTCGAAT TTAGAGTTTG 

133351 TAGGCAGGTA GACTGTGAAA ATCCATCAAT GAGTACTTTT CCTGTCGAGG 

133401 AGTTCATCTT CATGAGGCTA TCTTTAATCA CTCCGGGGTT GCTTCCTACC 

133451 CAAACGTCAC CACTCATCTT TGCTGGAAAC GTAAAAAATA AGAATGCCCT

133501 TCCTGATAGA GCAGGATTGA GGATGTTCAT CCCTGTGCCT CCGAAGAGCT

133551 CTTTACTGAC AACAATACCA AAGGCGATCC CTAAGGCTGC CATCCAGTAA
133601 GGAATTGTCG GAGGGAGAGT AAGGGGATAG AGGATTCCGG TTACTAGCAG
133651 TCCTTCTGCG ATTTTATGCC CACGAACTAC AGCAAATAGG ACCTCACAAG 
133701 TACCCCCGAC AACATAGCTA ATCGTAAGTA GAGGAATAAA GATCTTAAGT 
133751 CCTTCCCAAA GGATAGGAAC TATATGGATC TCTTTGTAAA CAAAGGATAA 
133801 ATAACTACCA AATCCAGAAA TATGTAAGAA TTGCTCCATC AGCACAGGAT 
133851 TGCCTGAGCT ATAAACGATA GATTGAAGTC CTGAATTCCA GATCGCAACA 
133901 AAGGTCGCGG GAAACAAAGC GATAACAACA AGCATCATCC AACGCTTAAC 
133951 ATCTACAGAA TCGCGGATGA AAGGAGGCTT GGAAGGGGTT TCAATAGGTT 
134001 CGTAACAAAA TGTATCTATC GCATCGACAA TGGGAGTAAA GCGCTGATAC 

134051 TTGTCTTGTT GACATAGTTT CCAAAGAGAA TTTATGAATT TTTTGAGCAT
134101 TGTGATTGAG AGAAAAGTTG AAGCTTCTAC ACGTTCAAAA ACGTAGCAAA 
134151 CTGCTTAAAA TTTTAGGAAT AAAAATTTTC ATGATTCAAA AATAGATGGT 
134201 ACTTTTTTCG TTGCTGTTTC CAAAGTTATG TTATGGCTGT CAAGCTCCAG 
134251 GAGCCTACTT TTGTTCCAAC TGCTTGGAAA AACTTCTCGT AGAAGATAGA 
134301 GAAGGGCGTT GTCTACATTG TTTTCGTTAT CTTGGTTCTT CCGAAACACG
134351 TCTATGTAGC CAGTGTTCAC CCTCTTCACA ACTTCAAGCT TTCAGCTTGT
134401 ACCTTCCTTC GCAAACGGCC CTCTCGGTAT ATGCTCGTGC TTGTGAAGGT
134451 AAGCGACCCG CTCTGCAGTT TTTTTCTAAG AGTATCGCCT TTGAGCTAGC
134501 TTCACTGGAT GAGACTCCGA GTTGTATTGC CTATATAACA TCGACAATTT
134551 CTAGGAAAAT CGTAGTAGAA GTTGCTAAAC TAGAAAAGCT TTTACGCATT
134601 CCCTTGTGGC CGTGGCTTCC TAAGAAAAGA CAAATAGAAA AACTTCCTAA
134651 AGGGGAAGGT ATCTGCTTTT TGTCGGCCTA TCCTTTATCA CAAAAATGGA
134701 TGCAAACTAT CGTTGGAGGG AGTGCATCAC CTCTAGTATC TATAAGTCTC 
134751 TTTCTCTCTC AGAATGATCA GTAATTCCTG CAATTGCAAG GTAACCAAGA 
134801 ATACGTACGC CCTCATCAAC ACACAACGTG TTCGCTTGGA TGTCTCCTTT
134851 AATGATTGCG CCTCCACGGA GTTCGACTTT TCCAGATACT GTGATATTTC 
134901 CTTCTACAAC CCCTTCAATA ATGGCTTCTT GTAGCTGAAT ATCTGCCTTT 
134951 ACCACTCCTT TAGGACCGAT AATAATTTTT CCTTTTGAGA CTAAAATGCC

135001 TTCAAAAGTT CCGTCAATAC GTAGGAGACG TTCAAAAGCA AGTTCTCCTT

135051 TAAAGGTGAC GCCTTCTCCT AAGGTAGTTT CAGGTTCTTC AAGAGGGAGT

135101 AGAGATTCTG TTCTTGGAGT TGAGGACCAT TGAGGAAGAG AAGATTCTTC 

135151 AGTTAAATTG TGATTCAAAG GGCGAGCTTC CGAAGCTTTA GGGTTGTCAA 

135201 AAAGACTTGG AGGGGTCTCT GGGCGCTCGG ATCTTGAATA TGGCGAGTAG 

135251 CTGGAAGGTG AAGAAGTTTC TTCTTCGTAA AGTGTTTGCA CATCTTCAAA 

135301 AGGACCTTTT CCTGTTCTAC GGAACATGGG ACACCCCCTA AATTAACCAA

135351 CAATATGATT TTTACAGAGA TTTACTTGAC GCTTAAGAGT TTCGTCATTT 

135401 TGTAATTTAT GTTCTATAGT TTTACAGGCA TAAAGTACTG TCGAATGAGT

135451 TTTACCAAAA GCAGCTCCTA TTGCAACTAA AGAATCTGTA ATAAGAGTTT

135501 TTGCTAAATA CATAGCAATT TGCCGAGCTA ACACAAGATC TTTAGAGCGT

135551 GAGTTTCCCT TAAGATCATT CAGCTTTACT TGGAATACTG TAGCAACACT

135601 TTTTAAGATC GTTTCTACAG AAATTTTTTG TTTTGTTGGA GAACGGAAGA

135651 GCTCTTTTAG AGTTTCTCGG ACTGTAGTTT CTGTAAGAGA CTTGCCGAAA

135701 AGACGACAAT AGGCAGTCAG CTTGTTGATA GCTCCTTCCA ATTGACGGAC

135751 ATTGCCATAG ATGTGATCCG CAATATAAAA TGCCATTTCA TTAGGAATGA

135801 GCAATCCTTT TTGCTCCGCC TTGTGCTGTA AAATCGCAAC CCGAGTTTCT

135851 AAATCAGGGA TGCCGACGTG AGCAACCAGT CCCCATTCCA TTCTAGCAAT

135901 GATACGCTCG GAAAGTTTGA GCTGACTTGG AGGTTTATCA CTGGTAATTA 

135951 CAATTTGCTT ACTCAGGTTG ATCAAAGTCT CAAAGGTATT GCAAAACTCT

136001 TCTTCAAAAT TTTGGCGATT CTGTAAAAAT TGAATATCAT CAACAAGAAG 

136051 TAAATCTAGG GAACGATAAA AATTTTTCAT TTTATCAACA GACTTGGATT

136101 TGAGATGGTA GACAAGATCG TTGATAAACG CTTCTGTAGT GATGCAATGG 

136151 ATGCGTAGAT TTTTATGATG TTCTCTTACG TAGTGACCTA CGGCATGAAG 

136201 TAAATGCGTT TTGCCTAATC CCACACCCCC ATGGATGAAT AAAGGGTTGT 

136251 AGGAGCGGCC AGGTTTCCCA GCAATACCTA CAGCTGCAGA CTTCACAAAT 
136301 TGATTTGAGG GACCTTCAAT GAAATTATCA AAGCGATAGG AGAGATTCAG
136351 CTTTAATTCA AAATCTTTAG TTTCTTCAAA GACCTCAGAA ATTCCTTCGT
136401 TTGATTCTTT TTGAGAAGCC ACGGGGGCTG AAGGTTTCTT GTGTTCTGCA
136451 ACTACAAATT CTAAAGCAGG CTCTCCATGA ACATCTAAGG GGACAAAAGA
136501 ACAGAGGTCT CTTTTGTAGT TATCAAGAAG ATAATTTTGT ACAAAAATGT
136551 TGGGGACTTC TAAGCGAATT TTCTCTTGAG TTTCTTCAAG AACTTGAATA
136601 GGAGAAATCC AATTTTCAAA AGCCGTTTTC GAGCAACGTG TCTTAACATA
136651 ATTTAAAAAC TGTTCCCAAG TAGTGCACTC GTTACAGGTT AACATGCCGC
136701 TCTCTTTATT TATAAAGCTT TCCCAAATAC AATCGACCCA TCCCATGAGT
136751 GATGGCGAGA AAATCTCATT GCATCTTGAC TATTCATTGC ACTGAGTGCA
136801 ATGACTCTTC GCGTTCAGAT TGGATCCTGC ATTCCCTGGG TGAAGTTCCT
136851 GTTAGGAATG GATCCTATAC ACCCTTCCTC CGCAGGTAAA CGCGGTACGC
136901 TCTGCCGATA AAATAACTCT ACCGATTACT AGGTTTTAAG GCAAATTGGA
136951 TCGTTGGTTT CGTTAGGCAA TAAGGAACCA CAAATTCAGG AAAAAATAAT
137001 TATGAAATTT TGTAATAAAA ATGGAAAAAG AACTAAAGAA ATCCGGAGTT
137051 CTTCAATCAC GAAATACGTT TTCTATAGGA GAAAAAATTA ACGAACTAAC
137101 GCAGCATTTT TTTTGGTTGC TTTACTATAA CTCATCAATA GAGCTTCAGC
137151 ATCATTAGCA ATTGCTGGTA TGGGACAGGA TGAAAGGTAG GTGGCAATGG
137201 CAGTAGCTTC TTCAATTCGT CCCAAACAGA AGAGAGCTTT TGTTTTATTT
137251 AAGAGTGTAG GCAGATGATC TCCTTGCATG CGGAGTGCCT GATCTAAAAC
137301 AGCAAGCGCC TGACTATTTT CACCAATTTG GAGATAAAGA CCTCCAAGAG
137351 TTTGATGATC ATAGATACTT AAAGGATCTA AGATCACTAG AGCTTCAAAA
137401 AAAAGAATCG CTTTTGAATA ATGCCCTTGG CGTAGAAAAG AATATCCTGA
137451 GATTCTGAGT TCTTCTAACT CATCATCTCC CCAGCCTAAG ATTGCTTTCC
137501 ATTCATTATC CAACATACAT TATCCTTGAA CAAATTGAAA GATACGAGAG
137551 ATCACATAGT CTATCATCAC ATTTGTAGAC TTGACTCCAA GATCTAGAGC
137601 TTTTAAGAGC ACTTTCCCTT GTTGTACTCT AGGATCGTCA TCGTCTTCTT
139001 GGCTCCTCAG GGACGGCTTG TCATTATTTC TTTTTGTAGC TCTGAGGATC
139051 GTCCTGTGAA GTGGTTTTTT AAAGAGGCGG AAGCTTCTGG CCTGGGGAAG
139101 GTAATCACAA AGAAAGTGAT CCAACCTACC TACCAAGAAG TACGAAGAAA
139151 TCCTAGATCG AGATCAGCAA AACTACGGTG TTTTGAAAAA GCTTCCCAAT
139201 GAACAAAAGT CGTTTTTTAC GTTTATGCTG CTGTCTATGC TTTTGTGGAA
139251 GTCTCTTTTA TTTCTATATT AATAAGCAGA ACTCGCTGAC GAAATTACGC
139301 CTCGAAATTC CTTGTTTATC TGTACGCTTG CGTCAGCTTG AGCAGCAAAA
139351 TATTTCTTTA CGTTTTTTAA TTGATAAAAT AGAAAGACCT GATCATTTGA
139401 TGGAAATAGC AGCTCTTCCC GAATACCAAT ATTTGGAATA TCCCTCAGAA
139451 GAAAGTATCA GTCTTTTATC CTATGAGCTA CCGTAAACGT TCGACTCTAA
139501 TTGTTCTAGG AGTGTTTGCT CTTTATGCTC TTCTAGTATT GCGTTATTAT
139551 AAAATTCAAA TTTGTGAAGG AGACCACTGG GCCGCAGAAG CTCTCGGGCA
139601 ACACGAATTT TGTGTCCGTG ATCCTTTTCG AAGGGGCACC TTTTTTGCTA
139651 ACACGACAGT ACGTAAGGGA GACAAAGACC TTCAGCAGCC TTTCGCTGTC
139701 GATATTACAA AATTTCACCT TTGTGCAGAT CCTTTAGCTA TTCCCGAATG
139751 TCATCGTGAT GAGATCATCC AAGGGATTCT CCAATTTATT GAGGGGCAGA
139801 CCTACGACGA CCTCTCCCTA AAGTTAGATA AGAAATCTCG GTATTGTAAG
139851 CTGTATCCTT TATTAGATGT TTCTGTCCAT GACCGGCTAT CCCTTTGGTG
139901 GAAAGGATAT GCAACAAAGC ATCGCTTACC AACAAACGCC CTATTTTTTA
139951 TTACGGACTA CCAACGCTCG TATCCTTTTG GGAAGCTCCT TGGACAAGTT
140001 CTCCATACCT TAAGAGAAAT TAAGGATGAG AAAACAGGAA AAGCCTTTCC
140051 CACAGGCGGG ATGGAGGCGT ACTTTAATCA TATTCTGGAA GGGGACGTTG
140101 GAGAGAGAAA GCTGTTGCGT TCTCCTTTGA ACCGTTTAGA TACGAATCGT
140151 GTTATCAAAC TGCCTAAAGA TGGCTCTGAT ATCTACCTTA CGATCAATCC
140201 TGTGATCCAG ACCATTGCAG AGGAAGAACT CGAACGGGGC GTGCTAGAAG
140251 CTAAAGCGCA GGGGGGTAGG CTCATTCTAA TGAACTCCCA AACAGGAGAG
140301 ATTCTTGCAC TGGCTCAATA TCCGTTTTTC GATCCCACAA ATTATAAGGA
140351 ATACTTCAAT AACAAAGAGC GCATCGAACA TACGAAGGTA TCTTTTGTGA

140401 GCGATGTTTT TGAACCCGGG TCGATCATGA AACCTTTGAC TGTGGCGATT

140451 GCTTTACAAG CTAACGAAGA GGCTAGCTTA AAATCGCAGA AAAAGATTTT

140501 TGATCCTGAA GAACCTATCG ATGTGACCAG GACACTCTTC CCTGGACGAA

140551 AAGGATCTCC GCTTAAGGAT ATTTCTAGAA ACTCTCAATT GAATATGTAC

140601 ATGGCTATCC AGAAATCTTC GAATGTCTAT GTAGCTCAGC TGGCTGACCG 

140651 CATCATACAA TCTTTAGGAG TGGCCTGGTA CCAACAGAAG TTGCTAGCTC

140701 TGGGATTTGG AAGAAAAACA GGGATCGAGC TTCCCAGTGA GGCCTCTGGT 

140751 TTGGTGCCTT CTCCCCATCG TTTCCATATT AATGGTTCCC TGGAATGGTC 

140801 CTTATCTACT CCATATTCTT TGGCTATGGG ATATAATATT TTGGCAACAG

140851 GGATACAAAT GGTTCAAGCC TACGCTATCC TTGCAAACGG AGGTTATGCC

140901 GTCCGGCCCA CTTTAGTAAA AAAGATCGTC TCTGCTTCAG GAGAGGAATA

140951 TCATCTTCCT ACTAAAGAGA AGACACGACT CTTTTCAGAA GAAATTACTA 

141001 GAGAAGTTGT TCGTGCCATG CGTTTTACAA CGTTACCCGG AGGTTCGGGA 

141051 TTTCGAGCCT CTCCTAAGCA TCACTCTAGT GCTGGGAAAA CAGGAACTAC 

141101 AGAAAAGATG ATTCATGGAA AATATGATAA ACGCCGTCAT ATTGCTTCTT 

141151 TTATAGGTTT TACTCCCGTA GAGAGCTCGG AGGGAAATTT CCCACCTTTA 

141201 GTGATGCTCG TCTCCATAGA TGATCCTGAA TATGGTTTGC GAGCCGACGG 

141251 CACGAAAAAT TATATGGGGG GGCGTTGTGC GGCACCCATT TTTTCTAGGG 

141301 TTGCTGACCG CACACTCCTC TATTTAGGGA TTCTTCCAGA CAAGAAGCTA 

141351 AGAAATTGCG ACGAAGAAGC TGCTGCATTA AAGCGTCTCT ATGAAGAATG
141401 GAATCGTTCT CCGAAACAAG GGGGAACGAG GTGAGGATCT CTATTTCCAT 

141451 CTTGCTATAG ACTTTTACCG TTGAGCAAAG ACTCTCTATC AGAGAGCCCG
141501 TCTCCTCTTT ATCCTCTATG AGTAGTTTAT GTTATGGCTA GGGTAGGTCC 
141551 TAAACTATAG AAATAACTTT AGCTTTCTTC CCCTAAATAA GAGACCAAAG 
141601 TCTTGATGAG ACGGTCTATT GAAGTTTATG GAAGGGGGAG GTAAGGCTGT 
141651 GTGTTTGGGG ATTTAGATTT GGGATAAAGG AGGCTTCTGT TCGTAGAAAC 
141701 AGGAGAGCGA AATTTTATAT TTCAGAGAAG AGTAAGAACT TTATGGACAG

141751 TTTTTTGTGA TTGCTTGTAT ACTATCTTGA TTGAATTTTT TGTCGACCTA 

141801 CGAGTAAAGA AATCCTTTAA GCATTTTTTA AAAATCAGAG TGAGAGCATG 

141851 CCCCTAGAGG GCTTTTTATG AAAAAAGTTG TTTTTCAATA GTCCCTGGAG 

141901 CGTAAATGGA TTTAAAAGAG TTACTCCATG GGGTTCAAGC TAAAATCTAT 

141951 GGGAAAGTTC GCCCTCTTGA AGTGCGCAAC TTGACACGTG ATTCCCGTTG 

142001 TGTGAGTGTT GGCGACATTT TTATAGCCCA TAAGGGACAG CGCTACGACG 

142051 GAAATGATTT TGCTGTCGAT GCTTTAGCTA ATGGAGCAAT TGCCATTGCT 

142101 TCTTCACTAT ACAATCCGTT TCTTTCCGTT GTTCAGATCA TCACTCCTAA 

142151 TCTCGAAGAA TTAGAGGCTG AGCTTTCTGC AAAGTATTAC GAATACCCTT 

142201 CAAGTAAGCT CCATACCATT GGGGTGACTG GAACCAATGG GAAAACTACA 

142251 GTTACATGTT TGATTAAAGC TTTATTGGAT AGCTATCAAA AACCTTCAGG 

142301 GCTTTTAGGA ACCATAGAGC ATATCTTAGG AGAGGGGGTG ATTAAAGATG

142351 GGTTTACTAC ACCTACACCC GCTCTTTTAC AGAAGTATTT AGCCACTATG

142401 GTACGTCAAA ATAGAGACGC TGTTGTTATG GAAGTCTCTT CTATAGGACT
142451 TGCCTCTGGA AGAGTAGCCT ATACCAATTT TGATACAGCA GTTCTGACTA
142501 ATATTACCTT AGATCATCTC GATTTTCATG GCACATTTGA AACCTATGTT
142551 GCGGCGAAAG CCAAGCTTTT CTCTCTCGTG CCCCCTTCGG GAATGGTTGT
142601 TATCAACACA GACTCTCCCT ACGCTTCTCA GTGTATTGAG AGTGCAAAGG
142651 CACCGGTCAT CACTTATGGT ATAGAGAGTG CTGCTGACTA CCGAGCCACC 
142701 GATATCCAAC TTTCTTCCTC GGGAACAAAG TATACCTTGG TGTACGGGGA 
142751 CCAAAAAATT GCGTGCTCTT CCTCATTTAT TGGAAAGTAC AACGTCTATA 
142801 ACCTACTTGC TGCGATCTCT ACAGTACATG CAAGTTTGCG TTGCGATCTT 
142851 GAAGATTTGC TAGAAAAGAT AGGCTTGTGT CAACCTCCTC CAGGTCGTTT 
142901 GGATCCTGTA CTTATGGGTC CCTGCCCTGT ATATATTGAT TATGCACACA 
142951 CCCCCGATGC TTTAGACAAT GTCTTAACAG GATTGCATGA GTTACTTCCT 
143001 GAGGGGGGAA GACTGATTGT TGTTTTTGGT TGCGGTGGAG ATAGAGATCG 
144401 GTTACTCAAC GAGCTACTTT AGGGCACTTT AATAGGAGGC ATTGTCTAAT

144451 ATGGCTACCA TGACAAAGAA GAAACTAATC AGCACGATCT CACAAGATCA

144501 CAAAATTCAT CCTAATCACG TACGTACCGT GATTCAGAAT TTTCTAGATA

144551 AAATGACCGA CGCCTTGGTT AAAGGTGACA GGCTTGAGTT TAGAGATTTT

144601 GGTGTGTTGC AAGTAGTAGA AAGAAAACCA AAGGTAGGAC GTAATCCTAA

144651 GAATGCAGCA GTCCCCATTC ATATTCCTGC TAGACGCGCT GTAAAGTTTA

144701 CTCCAGGGAA AAGAATGAAG CGCTTGATAG AAACTCCGAA TAAGCATTCT

144751 TAATTCTTGT AGTCTTCTTT GTCTCAGTTG TTAGAGTCAG ACCGGTTTTT

144801 TACCGGGCTT GACTCTAATT TTTGTTATTA TTATCGTTTG GTGCAATGCT

144851 TTTCTGATCA AATTGTGCGT GATAATGGGG CTGCAATCCA GGTTACAACA

144901 TTGTATAGAA GTGTCCCAGA ATTCGAACTT TGATTCACAA GTAAAACAGT

144951 TTATCTATGC GTGCCAAGAT AAGACATTAA GGCAGTCTGT ACTCAAGATT

145001 TTCCGCTACC ATCCTTTACT AAAAATTCAT GATATTGCTC GGGCCGTCTA

145051 TCTTTTGATG GCCTTAGAAG AAGGCGAGGA TTTAGGCTTA AGCTTTTTAA

145101 ATGTACAGCA GTACCCTTCA GGTGCTGTAG AACTGTTTTC TTGTGGGGGA

145151 TTTCCTTGGA AAGGATTACC TTATCCTGCA GAACATGCGG AATTTGGCCT

145201 ACTCCTGTTA CAGATCGCAG AGTTTTATGA AGAGAGTCAG GCATACGTCT

145251 CTAAAATGAG TCATTTTCAA CAGGCACTCT TTGATCACCA AGGGAGCGTC

145301 TTTCCCTCTC TCTGGAGCCA GGAGAACTCT CGACTCCTAA AAGAAAAGAC

145351 AACTCTTAGC CAATCGTTTC TCTTCCAATT AGGAATGCAA ATTCACCCAG

145401 AATACAGTCT TGAGGATCCT GCACTAGGGT TCTGGATGCA AAGAACGCGT

145451 TCTTCATCCG CTTTTGTAGC CGCTTCAGGA TGTCAAAGTA GCTTGGGAGC

145501 GTATTCCTCA GGGGATGTCG GTGTTATCGC TTATGGACCT TGCTCTGGAG

145551 ACATTAGTGA TTGTTATTAT TTTGGATGTT GTGGAATCGC TAAAGAGTTC

145601 GTGTGCCAAA AATCTCACCA AACTACAGAG ATTTCTTTTC TCACCTCTAC

145651 AGGAAAGCCT CATCCCAGAA ATACGGGATT TTCCTACCTT CGAGATTCCT

145701 ATGTACATCT GCCGATCCGC TGTAAGATCA CTATTTCCGA CAAGCAATAT
145751 CGCGTGCACG CTGCGTTGGC TGAGGCCACC TCTGCCATGA CGTTTTCTAT
145801 TTTCTGTAAG GGGAAGAATT GTCAGGTTGT TGACGGCCCT CGCTTGCGCT 
145851 CCTGTTCCCT AGATTCTTAT AAAGGTCCCG GAAACGACAT TATGATTCTT 
145901 GGGGAAAATG ACGCAATCAA CATTGTTTCT GCAAGTCCCT ATATGGAAAT 
145951 TTTTGCTTTG CAAGGCAAAG AAAAATTTTG GAATGCAGAC TTTTTGATTA
146001 ATATTCCTTA CAAAGAAGAG GGCGTCATGT TAATTTTTGA AAAAAAAGTG 
146051 ACCTCTGAGA AAGGAAGATT CTTTACGAAG ATGAATTAAT TTTGGGTCTG 
146101 TAATTGTGTT TAAGAATTGT TTGTATTAAA ATGATTCTTT TTATACGAGG 
146151 AGAGCACATT CTAATGGAAC TTCTTCCACA CGAAAAACAA GTAGTTGAAT 
146201 ATGAAAAGGC TATAGCCGAA TTTAAAGAAA AAAATAAGAA AAATTCTCTC
146251 TTATCTTCTT CAGAGATTCA GAAATTGGAA AAGCGTTTAG ATAAATTAAA 
146301 AGAAAAGATC TATTCGGATT TGACTCCTTG GGAGCGTGTA CAAATATGTC 
146351 GCCACCCTTC GCGTCCCCGT ACTGTCAACT ATATTGAAGG GATGTGTGAG 
146401 GAGTTTGTCG AGCTTTGTGG AGATCGCACC TTCCGAGATG ATCCCGCAGT 
146451 TGTTGGTGGC TTTGTAAAAA TCCAGGGTCA GCGTTTTGTC CTTATTGGCC 
146501 AAGAAAAGGG ATGCGATACA GCGTCACGCC TTCATAGGAA CTTCGGTATG 
146551 TTATGTCCCG AGGGTTTCAG AAAAGCCCTT CGCTTAGGAA AACTCGCTGA 
146601 AAAGTTTGGC TTGCCTGTGG TCTTTCTTGT CGATACCCCA GGAGCATATC 
146651 CTGGATTGAC TGCTGAAGAG AGAGGACAAG GATGGGCAAT TGCCAAAAAT
146701 CTTTTTGAGC TCTCAAGACT TGCCACTCCC GTGATTATTG TCGTTATCGG
146751 TGAGGGATGT TCAGGTGGAG CTTTGGGCAT GGCTGTAGGT GATTCTGTAG
146801 CTATGTTAGA GCATTCCTAT TATTCTGTAA TTTCCCCAGA AGGATGCGCC
146851 TCCATTCTTT GGAAAGATCC TAAGAAAAAT AGCGAAGCAG CTTCCATGTT
146901 GAAAATGCAT GGAGAAAACT TAAAACAATT TGGCATTATC GATACTGTTA
146951 TCAAAGAGCC CATTGGGGGA GCTCACCACG ATCCTGCATT GGTATATAGC
147001 AATGTTCGAG AGTTTATCAT CCAAGAGTGG TTACGATTAA AAGATCTAGC 
147051 TATAGAAGAG CTGTTGGAGA AACGGTACGA AAAATTTCGC TCTATAGGTC 
147101 TTTATGAAAC TACTTCTGAA AGCGGTCCTG AGGCATAAAA ATCATCTCGT

147151 TATATTAGGC TGTTCTCTAC TCGCAATTTT AGGACTTACC TTTTCATCTC

147201 AGATGGAGAT TTTTTCTTTA GGGATGATTG CTAAAACAGG CCCCGACGCC

147251 TTTTTACTTT TTGGACGTAA GGAATCTGGA AAACTTGTAA AGGTTTCAGA

147301 ACTAAGTCAG AAAGATATTT TAGAGAATTG GCAGGCAATT AGTAAGGATT

147351 CAGAGACACT TACAGTCTCT GATGCCACGA CATACATCGC CGAACATGGG
147401 AAAAGCACAG CCTCTCTGAC GAGCAAGCTC TCTAAGTTTG TCCGTAACTA

147451 CATCGATGTG AGCCGCTTTC GAGGACTGGC AATCTTCTTA ATCTGCGTTG
147501 CTATTTTTAA AGCAGTCACC TTATTTTTCC AACGTTTCCT TGGGCAAGTC
147551 GTTGCTATAC GGGTAAGCCG AGACTTACGT CAGGACTACT TTAAGGCCCT
147601 ACAACAACTC CCCATGACCT TCTTCCATGA TCATGATATC GGTAATTTAA
147651 GTAATCGTGT CATGACAGAT TCTGCAAGCA TTGCCTTAGC AGTAAACTCT
147701 TTAATGATTA ACTACATTCA AGCCCCAATT ACCTTCATAT TGACATTGGG
147751 AGTCTGTCTG TCGATTTCAT GGAAGTTTTC AATTCTTATT TGTGTTGCCT
147801 TTCCTATCTT TATCCTTCCC ATTGTCGTGA TCGCTAGAAA GATCAAAAAT
147851 TTAGCAAAAC GTATTCAAAA GAGTCAGGAT TCATTTTCCT CCGTTCTTTA
147901 TGATTTTCTT GCTGGGGTTA TGACAGTAAA AGTCTTTCGT ACAGAAAAAT
147951 TTGCCTTCAC AAAATATTGT GAGCATAACA ATAAGATTTC TGCTTTAGAG
148001 GAGAAAAGTG CTGCTTACGG TTTGCTTCCA CGACCCCTCC TGCATACCAT
148051 AGCTTCTTTA TTTTTTGCTT TTGTCGTCGT TATCGGAATT TATAAATTTG
148101 CTATTCCTCC CGAAGAACTT ATCGTATTTT GTGGTTTGCT CTACCTAATC
148151 TACGACCCTA TTAAGAAGTT CGGGGATGAA AATACCTCCA TCATGAGGGG
148201 ATGTGCTGCT GCGGAGAGAT TTTATGAAGT CTTGAATCAC CCCGATCTTC
148251 ATAGTCAAAA AGAAAGAGAA ATCGAGTTCC TTGGACTTTC TAATACAATC
148301 ACATTCGAGA ATGTTTCCTT CGGCTATCAG GAAGATAAGC ACATCCTCAA
148351 AAATCTAAGC TTTACCTTAC ATAAAGGCGA AGCTCTAGGC ATTGTAGGAC
148401 CTACAGGATC TGGAAAAACA ACACTTGTTA AATTACTTCC TAGGCTCTAC
149801 CTATGAATCA CGGTCACGTT TTTATTTTTA GAGTATTCTA CTGCTTTTGC

149851 TCCCGCAGTT ATAGGATCTT TCAGGGAGCA AATTAAAGAC TTTCTTAATG

149901 CTGCTGTTAG AGCATCCACT GTAGCCATAG GAACATATTT CGCAATGGCT

149951 AAACATCCTA AAGGAAGGGG AAAGATGGTC TTACGGCGCC ATAGCTCTCC

150001 AAAGTCTGCC CGCAATGTCA ATTGGAGATC GTAGCTGAAG CGCTCTTCAT 

150051 GAATCAGAGC GCCTCCATCG ACTTTCCCTT GCAGTATCGC GGATAGAATT 

150101 TTGTCATAAG GCATGGGAAT GAGTTTTGCC TTGGGATAGT AAAGTTTACA 

150151 GAGAGCATGA GCGGTTGTCA TCTCTCCAGG AGTTGCCAAG GTATCTAGAG 

150201 AACATTCAGG ATCTAAGGAG AGGACGATAG GACCGCTGTT GTATCCTAAG 

150251 GTATTTCCTA CGTCCATAAG ATTATAATAA TCAGAAACTA GAGGGAAGAG 

150301 CGCTGCTGAC ATTTTCATTA GGGAGAGCCG TCGCTGCAGA GCTAGGGTAT

150351 TCAAAGTTTC AATATCCGCA ATTGTTACCT GGTTAAGAAG AGGCCTGAAT 

150401 TGGGGGTCTT TTAAGAAAGA ACGAAAAAGG AAAATATCAT TCGGGCAAGG 

150451 AGAAAAGGCA GCAGTCAGTA TCATGTCGGT TGATGTAATA GAGCTATAGC 

150501 GGCTTTGATG TCTTTATTTT CAGGCTTATC CAAAGCTCCT TGGTTTTCCA 

150551 GCCATTCGAA GTAAGACTTA GGAATATCCA CAAGAGGCTG CCCTTTGTAT 

150601 TTGCCAAAAG GCATTTTGAA GACTTTCGGG TGATAGCTCT GTTGCAGCAA 

150651 GTCGAGGACT TGCTGGGGCG GTAAATCACC GATTAAAGAA GTAAATACCT 

150701 TGTGCAATAT CACTACGTCA TCTAGAGCTC GGTGTGCTTG ATTTTCAGCA 

150751 AAACCGTAAA CTTGTCTTAG GTATTGTAAA TTATGTTTTG GTAGATCGGG 

150801 GCGATATTTT TGTGCCCATT TTAGAGAGTC TATTGTACGG TTTGTCAGAG 

150851 GCTCTAAGGA ATGTCTGCGA CATTCCTTAC CGAGTAGGGG GAAATCAAAA 

150901 CCGTCATTAT TATGAGCCAC TAAGATGCTG TCCTCTCCGC AAAATTTCCT 

150951 AAATCCCTCG TAGGCTTCAG GAAATTTGGG AGCAGAAAGT ACCGCATCCG 

151001 TAGTGATTCC ATGAATTTTG GATGCCTCAT CAGGAATGGG AATTTCCGGA 

151051 TTCACATAAG TAAGAAAGGA CTCATCTGTG ACACTATTGT AGGCAGCAAT 

151101 TTCTATAATG CGATCTCTTT CTATTTGTGT TCCTGTGGTC TCCGTATCAT 
151151 AGAAAATAAG AACATCCATA GTTTGACTAC TCATTACGTT TTTCTTGTTG
151201 CTCTTGAAGA GCCTGACGTC TTAGTTCATC CAAATTCATA TTCCCAGAAG 
151251 AGATCAACCC AATAGCATGA GAAAAACTAT CACAGACTAG CTTTATTGTA 
151301 TCGATATATA TCCGTAATAG TGTGTCATGA ATTTCTCCGT TTAGGCAGGG 
151351 CAACACAAGC CGATAAAATA TCAATCCCTG TTCTTCATCC ATGCCAAAGC 
151401 CGGGAATATC AATGTCCCTA TTTAAGAGAT GGAGTAAACG AGCTGTTGAT 
151451 GCCTTATGAG ATTCATGCAA TTGGTAGGGA AGGTAACAAA TCAACTGCAG 
151501 TATTTCTCCC TCACTGCGGA TTACAAAAAA TAAAGGGAGT TCATTGCCAT 
151551 TAGCTTGAAT GTTAATGTAA GTAAGACCGC TTTCTCTTTC TAAGAAAGGT
151601 TCTTCATCCG AACtTTTAAG AAATTTTGTG AGATTATTTT GATTTAATGT 
151651 CCATGTCGTC ATTTAGGAAA TACTCCAAGT TGTTCCTAGA GCCTGCATCA 
151701 TTGCTGGCTG ATAATACTAG ATCTAATCTT GATTCGTCTG TTGTTTTTTT
151751 TGTATCTTTT GTATTGCTTT ACTGAGGAAA TCAGAGGTTG CAGAGGAAGA 
151801 AGACTGCTTA TTATTTTCTT GAAGTAGATG ATTGATATCA GTGTCGGGAG 
151851 GAAACTCTGT CAGTAGCGTA ATTTTTTCTG GTGTTTCTGC AATCACAAGA 
151901 GTTTTATTCA CAACTCGAAT GAGGTAAATA GAAGTTTTCG GCGTTAGGGA 
151951 ACGTCGTTCT AGGATTTTGA TTTGAGACGA GCCTCCAAAA CCGTGACTTC 
152001 TTGATCTCAC AAACTTTTTA AACGCCCAAA CTCCAAAGCC AAAAATTGTT 
152051 AAAAGTAGAA TCAAAGATCC TAGCATTTTA AACATTTCTA ATTTCATGCT 
152101 TCCTGGGAAC ATTTCATGTA CAGAAATGGG CTCTTGGATC GTTTCTGCAA 
152151 GAGCAAGCTC ATCAGAAAGC TTAAAAACTA AAGAAAAAAG ATTAAAAAAC 
152201 ATGTGTAAGA CCCGCGATCA TCTCTATAAA ATTATAGTGG TAGCCCGATT 
152251 TGTATCCAAC TACATACAAG TAATAATGAA GTATAGTTTT AGTCGATGCT 
152301 ATATAAATTA TAGTACAATG ATTTCCAAGT ACAGATTAAA CCGTAATCAT 
152351 GTATATTCCT GCAAGTACCG TCTAGAGAGC TCCCCTAGAT GATTTGGGTA 
152401 ATTCACAGAC TCCTTTATAC CCTTCTAGGG TGTGCTCGTT CCACAGAGCC 
152451 CAAGCTCTTG TCTTTCATAG ACAAAACGAC AGCAGTCTGT CGGTGGATGC 
152501 AAAGAATTTT TTCAAGTTGC CTGAGAAATT CCTTGGCAGC GCTTAATAAA

152551 ACTATGGTGA TGCTATGGAA AAGTTACTAG TGACTGATAT TGACGGTACA

152601 ATTACCCATC AATCTCATCA TTTAGATAAA AAGGTGTATG AGCGGCTCTA

152651 TGCGCTGCAC CAAGCTGGTT GGAAGTTGTT TTTCTTGACG GGAAGGTATT

152701 ATAAATATGC TGCACGCTTG TTTTCTGATT TTGATGCTCC ATATTTATTA

152751 GGATGCCAAA ACGGCGCTTC TGTATGGTCT TCAACATCAT CAAATCTTCT

152801 CTATTCTAAA AGTTTACCCT CAGATTTATT ATGTATTTTA CAAGATTGTA
152851 TGGAGGGGGC AACGGCTCTT TTTTCCGTGG AATCAGGAGC TCCTTACGGG
152901 GATCACTACT ATCGCTTTTC ACCGACTCCT ATAGCTCAAG ATTTACACGA
152951 ATATGTAGAT CCTAGGTACT TTCCTAATGC TAAGGAAAGA GAGATCCTAT

153001 TTGAAACGCG CTCTTTAAAA GACGACTATG CTTTTCCTAG TTTTGCTGCA
153051 GCAAAAGTCT TTGGACTGCG AGATGAGGTC ATCAGAATTC AAAAGGAGCT
153101 GGAACGCCAA GAAGCACTGA CTTCAGTCGC GACGATGACG TTAATGCGCT
153151 GGCCCTTTGA CTTTCGCTAT GCCATCTTGT TTTTAACAGA TAAAAGCGTC
153201 TCTAAAGGCA AAGCCTTAGA TCGTGTTGTC AATATACTTT ATGATGGAAA
153251 GAAACCCTTT GTCATGGCTT CAGGAGATGA TGCTAATGAT CTCGATCTTA
153301 TTGAGAGAGG AGATTTTAAA ATTGTGATGA GTTCCGCACC TGAAGAGATG
153351 CACGTTCATG CGGACTTTCT AGCTCCCCCA GCAGATAAGA ATGGCATTCT
153401 TTCAGCTTGG GAAGCTGGTG TCCGCTATTA TGACGACCTT ATGAGTCTTT
153451 AGGGAACATC TCAGGACCAA TTCCCATCAC ATTGGCTCCG TGATCTACGT
153501 ATAAGGTCTC ACCAGTAATT GCTGAAGCTA GAGGTGATGC TAAGAAAGCT
153551 GCAACGGCAC CCACCTGCTC GGCATTCATA GCCTCGGGAA TAGGCGCCCA
153601 CTCTTGGTAA TAGTCTACCA TTCTTTCAAT AAAACCAATT GCTTTTCCAG
153651 CTCGGCTTGC TAAAGGTCCT GCAGAGATGG TATTGACACG TATGCCCCAA
153701 CGGCGTCCCG CTTCCCAAGC AAGAGTTTTG GTGTCACTTT CCAAAGCTGC
153751 TTTTGCCGAA CTCATGCCCC CTCCGTATCC AGGAACAGCG CGCATAGAAG
153801 CCAAATAGGT GAGCGATATT GTCGATCCAC CACGGTTCAT GATACTTCCA 
153851 AAGTGAGAGA GAAGGCTAAC AAAAGAATAA CTAGAGGCAC TGAGAGCCGC

153901 TAAGTAACCT TTTCTTGATG TTTCTAATAG AGACTTAGAA ATTTCAGGAC

153951 TATTTGCCAG CGAGTGGACA AGAATGTCAA TATGACCAAA ATCTTTTTTT

154001 ACCTGTTCTG CGACTTCTGA TATCGTGAAT CCCGTAATGC CCTTGTAACG

154051 TTTATTTTCA GCAATATCTT CAGGAACATC TTCAGGGCTA TCAAAACTTG

154101 CGTCCATGGG ATAGATCTTA GCAATCTCTA AGAGAGTGCC ATTCGATAAT

154151 TTTCTAGATT CATTGAATTT TCCTAATTCC CAAGACTGAG AGAAAATTTT

154201 GTAAATCGGT ACCCATGTTC CTACAATAAT CGTAGCTCCT GCTTCTGCAA

154251 GAAGTTTAGC AATACCCCAG CCATATCCTT GGTCATCACC AATGCCCGCA

154301 ACAAATGCTA CCTTTCCTGT TAGATCAATC TTTAGCATGA ATCCGCCTTA
154351 TACTTTTGAA GCTTATTGGA AGGAGAGTAA CAAATCTTTC GATTATTAAG

154401 AAAACCTTTT GGTGCCTCAA CAGGGGAGAT CCTGCCTCCA ATGTAAATAG
154451 AAACGTAAAT TCTTTAAATT TTTTTCTTTA CATATTTTAT AGAATATCCA
154501 AACTTCTCAC TCCCGCGTAC TGCTAAAAAA ATTTTCAAAA GAATTTACGA
154551 TCCGAACTTA TCGTAGTTTG GGTTTCACTG ATTACTTAGG AGGTTGTTTG
154601 ACGAATCCTT TAGGGAAATT CCCCTCACCA CAGAATCCAC AGGTTGTTAC
154651 GATAGCGCCT TCTTCCACAA CACCACAAGC AGTCTCATCT GCAGTTCAAG
154701 GTTTTCTTCA AACTGGAGGA GCTGCCTCCT CTACAGCGAC AACTACTACC
154751 GCATCCGGAG CCTCTGCATT AGGACTTTCA CCTGATCAAG TGCAAGCGTT
154801 GCTTACTAAT TTATTAAATG TGGGACAACC ATCAGTGGGA CAACCATCAA
154851 CTTCAGCAGG AACTTCGGGA GCCTCCTCTT CCAGTGCAAG TATGCAGCAA
154901 CAGCTTTTGC AACTTATCTT AGACAAGACA ACAGGAAGTG GCGGATCGTC
154951 CGTGAGTTCA GAGCAATTAC AGCAACTCCT TAGCTTGGTG AGCCAGATGA
155001 CTACGTCTCA AGGAGGAAGT GGTGGAACTC AGGCAGGACA GGCCGCTTCG 
155051 GTACTGTTGA ATTTGTTATC GGCAACAGGA TCTGCAGCAG CAAATCCTTT 
155101 AGGGACAGCT GCATCGTTGG CACAGATCAT TTATGCAGCA GTAACAAGTC 
155151 CTGGAGCAAA GAAAACTAGC GAATTTTGTT ATAATTATTG TGGAGAGACC 
155201 TGCCAAGGCA ACTGCGGTTG TCCTACCTGT GGCTGTCCAG ACGGACAGTG
155251 CGGTTGTGGA GGATTTGGCC GTTTTTTCTG TGGTGTATGG AAAAATTGTT
155301 GCGGGATAGG AGAGGGATCC CAAGAACCCG CAATCCCTTT ATAAGAACTC
155351 GAGGCTTTAG AACAGAAATA TGGCAAGGCT GTTCTTTTAA TTGCGTTAAG
155401 TGAGCTTGGC ATTGATACCA TGAGCTTATT ATCAGGACAT CGACTCGAGG
155451 GATTTCCTCC AATCGCGGAG GTCATGGCTG CATGTGACCG GTGTTCTATG
155501 GACTTTTGTG AGATCTTGAA GTCTCAAAGC ATGGATCTGT GGGCGGATGC
155551 GGCGAGTTGT GTGGATGGTT TATTACAGGA TCCTTTTTGG AGTACAGCAA
155601 TTGCCTCAGG GATTGCTAAG TCTTCTCTTC AGGAAACGGA ATTCGAGTGT
155651 GAAAGCAAAG TGATGGTTCT TTCTTCATGG GGAGAGCAAG GAGCACAGGT
155701 TTGTAGCCCT TTTAACCTAG AGAGGATATG TATGTCTTTC CCATCACTTA
155751 AGGTCTTCTC CCTTAAAAAG AACGGGTGCG AGAACATGGG AATCCAGTTG
155801 TCTGCATCCT GCATGAATCT ATTAATGTCT ATTTTCTTTG TAGCTACCAA
155851 TGGAGGAAGC ACTCCGATTT GGATCACCAA AGAAAATCTG ATGGCGTTAG
155901 TTGCTTTGGT TTTATCTCAC TATCAATGTT ATTTTGTCCC AGCCACAGGA
155951 GATCCCCAAC GTGGCAACAT TTTAGGTAAT CCAGAAGTCA ATGCTATTTT
156001 GGCTCGGGGG ATGGGCATGC GTGTCGATCT GGAAAGGAAG CGAGGGGGAG
156051 AATCTTCCTC GTCACGCTAT TTAGAATTAG CTGCACGATG TTTTGAGAAT
156101 TCTCTTACGA AAACAAGTTT GTTAAGCGAT GCTAACAATG TTCAAGAAAG
156151 AGATAAGTGC CTACTACAGA TGTCAACTTC ATTGATGCAT ACGGCGGGAC
156201 TAAATTTACA ACGCCCCCCT GTACCCACAC CTTCTGGAGT CACGGCACAT
156251 CCGCAACCTC AACCAGATCC TGTGGTTACG TCTCAACCTT CTTTATTAGG
156301 TGCTAGAGAG CGTTCCCCTG TGTCTTCTAG AGGGCGTTTT CCTGTAGTTT
156351 TACCTTTAAG TGTGATTTCT CCTAGGTCGC ACCCCGGAAG GGTAGAAAGG
156401 CGGGATTTAG AAGATGAAGA AGAGGAGGTT ATGTTTTGAA GCAGTGTAAA
156451 CGACTCCAAT TACAGTTTTA TGAATCTCTA ATTGTAAAGT TCTAGGGGTT
156501 TTTCTTGAAG TAAGTGCCGA GCACATTCTC TAGGATCTTC GGTTGATGAC
156551 GCACAAATTT TTAGGGGAAG ATTTGTGAAT GGGGAGATAA ATTCTAGGGA
156601 GTGAGCATGG AGGAGAGGGC GGAAGATCTG GGGAGGCTGT TCTTTAGGTC
156651 CGTAGTCGAC ATCTCCGACA ATAGGATGAC CCAGCAATCC CATTTGTAAG
156701 CGGATTTGAT GGGTTCTCCC TGTGATGGGC CTGCGGCTCC AAAAATCACA
156751 GCTCCACACC TCCGGTATAC GGGGGCCGTA TAAGATTTTA CGGTTCCAAA
156801 TTTTTTTTTA GGATGACCAA AAACGAAAGC TATGTATTGT TTATGGATTT
156851 TTCTTTGCTT GAACAATTTC ATGAGCTCAG TAGCCGCTTG TTTAGACTTT
156901 CCCATGAGAA GACACCCAGA GGTGCCTTTG TCTAACCTAT GCACAGTAAA
156951 AAACCGTGTC ATGTGTGCCA TTTGTTCAGT AGTAAGATGG GGAGGTTTTT
157001 CGTAGATAAT GCTATAGTCA TCCTCCCAGA GGATGCTAGG TTGTTGTTTT
157051 GTTGAGGGGA TCAGAGATAG GGAAACACGG TCGCCAGGTT GTACCTTGTA
157101 GGATTCAAAT CTTTCTATGA ACCCGTTCAC TCGACATCGA TGTTGGCGAA
157151 TAGACGCCAA GATTTCTTGC TTGCTATGAT TAGGCAGTTG AGATCTAAGA
157201 AAAGAAGATA ATCTTGAGAC TTGTGTGGCA AGCCAGGAAA AATTTTCCAT
157251 AAAATATTGT AAAGCAGCCC TTTTATCATT TGATAATTGC ATAAAATTTT
157301 AAGAGATTTT GTATGACAAA GATAGCTTTT TCTGAAAAGG CAAAGAATTT
157351 TCCTGTAGAG GCATTAAAAA AATGGTTTGA AAAAAATAAA CGATCTCTTC
157401 CTTGGAGAGA TAACCCGACT CCCTATAGTG TGTGGGTTTC CGAAGTTATG
157451 CTACAGCAAA CGCGAGCTGA AGTTGTTATA GATTATTTTA ATCAGTGGAT
157501 GGAGAGATTT CCTACCATAG AGTCTTTAGC TGCAGCAAAA GAAGAAGATG
157551 TCATTAAGTT ATGGGAGGGA TTGGGTTATT ATTCTCGAGC GCGCCATCTT
157601 TTAGAGGGAG CTCGCATGGT TATGGAGGAG TTTCATGGAA AGATCCCTGA
157651 TGATGCCATT TCCTTAGCTC AAATTCGTGG AGTTGGTCCT TATACGGTTC
157701 ATGCTATTCT AGCCTTTGCT TTTAAGAGGC GTGCTGCTGC TGTGGATGGC
157751 AATGTCTTGC GTGTTCTTAG CCGGATATTT TTGATAGAAA CTTCTATAGA
157801 CTTAGAATCA ACTCGTACTT GGGTTTCTAG GATTGCTCAA GCGCTTCTTC
157851 CTCATAAGAG TCCCGAGGTT ATAGCTGAGG CTCTGATAGA GTTGGGAGCT
157901 TGTATCTGTA AAAAAGTTCC TCAATGTCAT CGTTGTCCTG TCCGTCAAGC

157951 ATGTGGAGCT TGGAGGGAGA ACAAACAGTT CGTATTGCCG GTACGTCATG 

158001 CCAGAAAAAA GGTCATCTTT TTGCATCGTT TGGTAGCGAT TGTATTGTAC 

158051 GATGGCTCTT TGGTTGTCGA GAAGAGACGT CCTAAAGAAA TGATGGCAGG 

158101 CTTATATGAA TTTCCTTATA TTGAAGTTGA ACCAGAGGAA GGTCTTCAAG 

158151 ATATAGAAGG ATTTACTAAG AAGATGGAGC TTTCTTTAGA AAGCCCTTTG 

158201 GAATTCTTAG GTAACCTTAA AGAACAGCGG CATGCGTTTA CTAATCATAA

158251 GGTTCATTTG TGTCCTATAA TTTTTAAAGC CACTTCTCTG CCTCAGTTCG 

158301 GGGAATTGCA TCTTTTGAGT GATATAGATC ACTTAGCTTT TTCTTCAGGA 

158351 CACAAAAAGA TTAAAGATGC TTTGCTAATC TACCTCGGGG ATGTCAGGTC 

158401 TAGAGAATCA ATAGGAGTAT AGATGCGAGA TCACGCTTTT TCTAAATTGA 

158451 TAGGGACTGT CCGTGCCATG GTAGTTGAAG GACGTTGTCC TTGGTCACTT
158501 CAGCAATCCC TAGTCTCTAT GGTAGAGCAT ATTCTTGGAG AGTGTCAGGA 

158551 ATTTCACGAG GCCGTCTTAC AAGGTAAGAC GGTACAAGAG GTTGGTTCCG
158601 AAGCCGGGGA TGTCTTAACT TTAGTTCTAA TTTTATGTTT TCTGTTAGAA 
158651 CGAGAGGGCG TACTTGCTTC CGAAGACGTT GCCAATGAGG CTATGGAAAA 
158701 ATTGCGTCGC CGTGCTCCTT ATATATTCGC TGAAGATTAC AAGCCGGTCT 
158751 CGATTGAAGA GGCCGATCGC CTTTGGGAGC TTGCTAAGCA CCGAGAGAAA 
158801 AATGAATCTA CATAGTTGAA GTTTTGGTCT ATTTTTAAGC ATATGGTGCT 
158851 TTTGAAAAAA CAGAATATAT GCTATCAAAG AAGGGTAAGT TGGGGGCCTT 
158901 TTAAGAGAAG GAACCTGCGA ATCGGGTCAG GACTGGAAGG TAGCAGCCCT 
158951 AAGGAGAGTT TTCTTTTGCT AAAAGAATGT TCTCCAACTT ACTCTTTTTA 
159001 CTTTATTCCC AAAAATAGCA ATGAGGTGAG GTTAAACAAC CCGTGCAGTG 
159051 CAATGGGAGA AAGAATGTGC CGATCTTTTT CATATAGAAA CCCTGCAGAT 
159101 AAGGAAAAAA CAAAGAGCAC GGGGACAAAG ACCCAACTTC CTAAAGAGTG 
159151 TTCAATGTGA ATGAAAGAGA AAATAATAGA AGAGCATAGT ACCGCAGCTA 
159201 TGCGCGTCAT TTTGTTTTTC AAGAATGTCT GTAGAATTCC TCTAAAAAAT 
159251 ACCTCTTCTC CAAATGGAGT GAGGACGCCT AAATTTAGAA TCATGCTAAT
159301 GTAGTGTCCT GTTATAGGCA GAGAGTTCTG AACTTCTTGA GTGACTTCTT
159351 GTGTGTGAAT CTCTTGCGTA GGAAGAACCA AAGTTAAAAA TTTACTCATC
159401 ATAATCCCAA TCAGTTGTGT TACTGGGATG ATGATGATCC ACATTCTGAT
159451 GGCAGATCCT AGAGCACGCC ATGAAGTTTT AACCGGTCTT TCTCCAGAGA
159501 AAAGTATAGC ACGTGTGATA TCCTTGGGGA GAAAAAGCAG GTAGAACAGA
159551 AATGCAAAGG CAAGGCTAAT TCCTGTCATG GTGGAAAGTA ATTCCGCAGT
159601 TTGTGAGCTC ACGCTAAGAG CTACAAGGGA AGAAAAAACA AGAAGAGCAC
159651 CACCAAATAA AACTTGGCGG AGCTTTAAAG GTGTTTTCCC AGAGGGTGCT
159701 GGCCAGATAA AGAAGTTTTT GGAAGCTAGA GCAGCGACGC CAAGGGACAA
159751 GAGAAGAATA AACTTGGACA TTTCCTTAGA CTACGAGTAG TTAGCACAAA
159801 CATAGCCCTC AACTCTGGCA ACAACTTCGC GGAAAAGACG GCTATGCATT
159851 AAGCCCATGG GCGTTGAATC AAAATGTTTT GAGTTCCAGC CATAGCGATG
159901 ATAATCATTG ACTAGGGTAG TTAAAGGCTG GCTGCATTCG ATAATCTCTT
159951 GATAAATGAG AGCTATTTTA TGATGACGGA TATCAAAAAC GCGAACACGT
160001 ACAGACGCTG TTACAGAATC GACACCTGCT TCTTTCCCTG TCTTTTGTTC
160051 TAACAGTTCT GTAGCAACAA TGAATTCTGC AGGAAGAAAT TGCTCAATAA
160101 TTGTTTCGGG TAGACGATTC GCAATCGGAG CATAGAACTG AGAGACTGTC
160151 TGAGGTGAAG CATTGTGCTT GATCAGGAAG ACCTTTTCCG AAGCATAAAA
160201 CCTTTTGCTG ATCTCTTCAG TAAATTCTCC TTGGAGGTTC CAAGGTAAAG
160251 GTTCAAGACT CTTTCCTGGG CGATGAAATA CAGGAAGCAT CGCAATCACA
160301 CCTTTAGTTT TGCTCCCTGA AGTGTATAGC TTAGGATGAT AACTTCCTGA
160351 AGAGCCTAAG TGAGTGCAGC TGGATAGGGT TGGGGATAGA AGTCCTAAAG
160401 ATGCCAATAA TACCAACATT TTTCGCATAG TCACTGTCCT TAAATTGCTT
160451 ATTTTGCAAA AGATTCTAGC CCTGGGAAAG TTTTTACTTT TAAGATCAAT
160501 ACTTTCGCAA TTGAGAGATT TTCCATTTAA AACTCTCATT AGCTTATATC
160551 AAAGAAAAAA ATAAAAACAA GCAAAGAGAC CGTCTCAGTT TTAGTTTAGA
160601 AACTCAAGGT TGAGAAAGGG ATTCTGACCA AAGTTGTGAG GGAACTTTGG

160651 TAACTTTTTC TTTAGGAATC AATGTGCACC CTGGGAGGAA TACAACTGGA 

160701 GCTGAGATCA CAGAAAAAAT AAACCACTTC ATGACTACCT CTGCAAAAAG 

160751 AATACTACTA TTTTCTATTT GCATAGGCAG TTCTTCGATT TATAGCAATT 

160801 TTTACTTTAT CATCATAAAA ACTATGATGA AAAGGTTCTT AGTGAACTTC 

160851 TAAGGAACAC CATGAGTTTG TGATTAAAAC TCATGGTGCG TAGTTGAACC 

160901 CTTATTGAGG AGGGACGCAA CCACTAAGAG CTAACAATAC TAAACAAGAT 

160951 AATAGTAAAG AGAATGCTTT TTTCATTATT CATCCTTAGG TTAATTGACC 

161001 TTTCCAGCCT AGCTCTAGGG CGATTATTTA TCAAATTTTT CTTTGTAATT 

161051 AATGATCATG CGACCATTAA TTTAGCGATA AATTATGATT TCGTCAGGAA 

161101 AATTCAATTC TTTATAATAA TGATATGAAA TTAGAGAATG TCTATAGGGG 

161151 CGGACTCTAT TTGTGATCCA GGATCTCTTT AGGAGCACTT TGTGGATTTT 

161201 GATTATTTTG GTCTGAGTGA TATTGGTAGG GTGCGCGCTA GAAATGAAGA 

161251 TTTTTGGCAG GTAAACCTCA TGTCTCAAGT GGTTGCTATT GCTGACGGTG 

161301 TTGGGGGGCG TCTTGGTGGA GACATTGCTT CTCAAGAGGC AGTGACTAGC 

161351 CTTATGGAGC TGATTGATGA GCAACAGTCA AAATTGATGG GGTATGGGGA 

161401 TGACCAGTAT AAGGAGACTT TAAAAAAGAT CCTTTTAGAG GTCAATGGTG

161451 TGGTCTATGA ACACGGCCAA ATGGAAGAGC ATCTCCAGGG TATGGGAACC

161501 ACTCTTAGCT TCATCCAATT CCGGAAGGAT AGGGCATGGC TATTTCATGT 

161551 GGGAGATAGT CGAATTTATC GTATTCGTGA GGGAGAACTG CGCCGCCTTA 

161601 CCGAAGACCA TTCTTTAGAA AATCAATTAA AAAATCGTTA TGGGCTTCCT 

161651 AAACAATCAG ATAAGGTGTA TTCTTATCGC CATATTCTGA CTAATGTTTT 

161701 GGGAAGTCGT CCCTATGTCA TGCCTGACAT TCGGAATCTT CCTTGTGAAA 

161751 AGGAAGATTT GTACTGCCTC TGTTCGGATG GATTGACAAA CATGGTTCCA 

161801 GATATCGATA TTCGTGATAT CTTGAACCAG CCCGCCACCC TAGAAGAACG 

161851 GGGGAATGCA TTAATTTCTC TAGCCAATAC TCGTGGAGGC GATGACAACG 

161901 CTACTGTCGT ATTAGTCCGA ATACAATAGT TCCTTTGCTA AGGATAGTAT 
161951 TCCATGATCT ATTTGGATAA CAATGCGATG ACACCCCCAG AGAGGGGACT

162001 TTTGGAATTT CTCCAAAAAA CCTTCCTTAT AGAAGGGACG TACGCGAATC 

162051 CTTCGAGCGT CCATCAATTA GGTAAAAAAT CTCGTCAACT GGTTCTAGAA 

162101 GCTTCACACT GGATGCAAAA GGTCCTTTCG TTTCAGGGCC GTGTCCTCTA 

162151 TACCTCAGGG GCTACTGAGA GTTTAAATTT AGCAATAGCA AGCCTCCCTA 

162201 AAGACAGTCA TGTTATCACC TCAGGTAGCG AACACCCCGC CATCTTAGAG 

162251 CCTTTAAAAC ATTCCTCGCT TTCCGTTTCT TATTTAAATC CCGAAGAAGG 

162301 GAGATGTGTT CTTACTATAG AGCAGATTGA AAGAGCTGTG ACTCCTAAAA 

162351 CTTCAGCAAT CATCTTAGGT TGGGTCAATA GTGAGACTGG TGCCAAAGCT

162401 GATATAGCTG CTATAGCCCA CTTCGCGCAA GAACGACAAT TGCAATTTAT
162451 TGTGGATGCG ACTGCAAATG TAGGTAAGGA GAGGATAGTT CTTCCCTCTG
162501 GTGTCACTAT GGCAGCATTC AGTGGACATA AATTTCATGC ACTCTCTGGA 
162551 ATCGGAGCTC TTCTGGTCTC TCCAGGAGTC AAACTACATC CTCAGCTGTG 
162601 GGGAGGAGGT CAGCAAGGAG GGCTGCGCGC AGGCACAGAA AATCTTTGGG 
162651 GAATCGCCTC TCTGCTTTAT ATTTTCAAAT ACCTAGATCT TCATCAAGAG 
162701 CGTATCTCTC AGGAAATTCT TACCCATAGA AATGGTTTTG AAAAGGCAAT 
162751 CAAAGCACGC ATTCCTGATG TCCATATTCA TTGTGCGGAT CAACCACGGG
162801 CAAACAACGT CTCAGCAATT GCTTTCCCTC CGTTGGAAGG TGAGGTATTG 
162851 CAAATCGCCT TAGATATAGA AGGAGTGGCT TGTGGTTATG GATCCGCATG
162901 CTCTTCAGGT GCTACCGCAC CCTTTAAATC TCTTGTCAGC ATGGGTGTTG 
162951 ATGAAGAGTT GACCCTGGCA ACACTCAGGT TTTCTTTTAG CCATCTTCTC 
163001 TTGCAAGAAG ATGTTGAAAG AGCCGTTGGA ATTATAGAAA AAGTCGTAGA
163051 ACGTTTGAAA AATTCCTAAG TCTTAAAAGA GAACATGTTT CTAAGCTGAA 
163101 AGAACACTCC TGACTCTTAT TGCAGAATCT ATGAGAGTAA GTTTTTAATC 
163151 GATACGGTTT TTATCCCAGA TAAAGACATC TCTTTAACTT CTAAAAGCAA 
163201 GTTGTTGATA ATTACAGAAG TTCCTACTTC TGCAGGACTG TCTAGCAGTT 
163251 GCAATACCAA TTGGGCTAGG GTTTCTACAG GATATTGCGG AAATTGAATA 
163301 TCGAGTTCTT TTTGCAGATC TTTTATGCGA GAGTTGCCAG GAAACGTTCT

163351 TTCAATAACA GAGATGGTCT TGGGTTTTAA ATGAGCAATG TTTGTAGTGT

163401 TGAATAAGAT TTTGAAAATT GCATTTAAAC TAAGAATACC TATAGGTTCA

163451 CCAGAAGCAT TGAGGACAAC AGCAACACTC GAACGGTTGT CTCGAAACTC

163501 TTTGAGGATA CGAATAAGTT TTGATTTTGC AGTGATAAAC CAAGGCGAGT

163551 GTAGATTATT GATTAGGGGT TCATCAAGAG CTTTATTGAC AAAGTCTTTA

163601 GGATGGGCAA TCCCAATAAC GTTTTTTCGG GCCTTGTGAT AGACAGGAAT

163651 AAAGTTGATA TCTGTATTTT TTATAGTCCG GCAAAAATCT TTAACATTTG

163701 CAGAAGAAGG AAGCATGGTA ACCTGTTCTA AAGGTTGGCA TACCTGATCT

163751 GCACAAGTCG CACTTAAAGA GAAAATATTT GTAGCAATTG TATTGAAATC

163801 TTGTTCTTCA TGGTGAGTCT CTAAAGCTTT TTGGAACTCG TCTCTACTTA

163851 ATGTAGAGTT CAATTTTTCT TTCCTAATAT TTAGAAGATA GTAAAGACCC

163901 TCAGTGAGAC TTCCTATGAG CTGAATCAGA GGATAGAAAA TATAGTGGGA
163951 ATAATAGAGA ATCGGTGCTC CCCAAAGTGC TAATTTTTCA GGAATCTTCC

164001 GTGATATTGT TAGAGGTAGA AGTTCTGCAA AAATCACAAC TATAAAAATT
164051 TGAGTGAAAG GAGCGTAATC TGGAGTGATT CCTAAAGCTC GATAGCAATT 
164101 TCTTGAGGAC TCAGACCCGA CTTGTAGAGC GATATTCACT CCTAACATCA 
164151 CCGTTCCAAA TAAACGATAG GGGCGGCGAA TCAGGAAATT AATGTAGCGA 
164201 GCTTTCTTAT GATCTTTAGT CAGATAGTAT TGCAATCGTA CACGGTTAAA 
164251 TGACACGCAG GCCATTTCCA TCATCGAATA GAATCCTTGT AAGACAATAC 
164301 AGATAATGTT GACTCCTATC CAAAAGAGAG CAGAATTAGT CATACAATTT 
164351 CCTTATATAC ACACGGCGAA TGCGATTCGG AGCAGCGTCT AATACCTGGA 
164401 AAAGCAAGTT ATTCCAAGAG AGTTTCATTC CTGTTGTCGG AATCGTTCCG
164451 ATTTGCTCTA TTAACCAGCC TCCTATAGTC GCAATATTAT TGTTCGTCGG
164501 TAGGTTGATA TCGAAGATCT CACTAAACTC ACGGAGTTCT AAAGTTCCTG
164551 AGGCAATAAT AACATCAGCT CCTGAGGTGG TATAGAGTAT TTTATTATCT
164601 CTCTGGTCTA CAATTTCTCC AGCAACAATT TCAAAGAGGT CTTCTTGAGT
164651 GATCAATCCT TCAATAGATC CGTATTCATC AATGATCATC CCTAGGGTTT

164701 CGTCTTCAGC TGCCATCTGA CATAAAGCCA TTTTTGCAGA GATGGTTTCT

164751 GGCATATAAT ACGGTTTTTT CAGCAAGGGG AGGAGATCAT CCGAAGATTG

164801 CAGTGGCTTG TCATGTAAAA GAAGAGAGCG CGCTGTGCAA ATGCCCAGAA

164851 GGTTTTGGAG GTTATCGTTA CATATAGGAA CTCGTGAGCA ATGCTGTTTA

164901 GAAAATAAAA GATAGAGGTT CTCTAAAGGG GTTTGGATAT CATAAAATAA

164951 AATATCCTGG CGTGGCTGCA TACGCTCTTT AACACTACAA TCACTAAGAG

165001 AAAGATAACC ATAGAGTAAA CGGCTTTCTT CTTGATTGAC TACGCCGAAA

165051 TCCTTACAAC TTTGCAATAC TTCCTTCAGC TCTTGGGGTT GGATGATATC

165101 AATCTGTTGC TTCGATAAAA TCCATTGGAC CACATAATTA ATTCCTACGA

165151 TACCCCAGTG GAGTAGGGGT TTGAAGATTT TAGTAACACA AAGAATAAGA 

165201 GGGGCTACGG AACTAGCAAT CTGTGTATTA AAAGGAAGAG CTACTGCTTT 

165251 AGGGAGAATC TCACCTAAGA TCAAAGTAAT TGCTAAAGGA AGACCTACAG 

165301 TAAACCACCA CGAAGCTGCA TCTCCAAATA GAATGGCAAA ACAGTTTTGA 

165351 ATAGCAATAT TCAGTCCGAT ATCACAAAAA ATTAAGGTGA TGAGCAGGTG 

165401 GTGGGGATGT AGAAGAAGGG TAGCTACTCG CTGCTGTTTC TTAGATTTAG 

165451 AGCGCTTATA GTGCGAGATC AAACTCGTAG GCAAAGAAAA CAAAGCAATT 

165501 TGAGATAACG AAATGAATCC CGAGCATAAA GTAAAACAGA TAATGAAGAA 

165551 CATTAACATG GTAGGAATCA TGGTCTCTTT TCAGTCCTTA TTTTCTGATT 

165601 GTTGCTTTGG GGAGACACAG AGTTTCTTAT AGCCTTTAGG AATGAGACCC 

165651 AGACGCTGGA TTAAAGCAGC TTCTATAGCA GCGGAGTCTT GCCAGTGTTG 

165701 CAGATGTAAT TGGAGTTGAC GCTGCTTTTC TTGAGCAGAA AGAATGTCTT 

165751 GGCATAAAGA AGAGACCTTG CTTTGTAAQC GTAGCTCTTC TGTACGTAAC 

165801 TCCTGGATAG CACGATCATA AACAAAGCCT CCAATTAAGA TGCTAAAGAT 

165851 CACCCACCAG GATTTGATCA TCACTTCTTC TAGTAATCTA AAACCCCAGT 

165901 -pTTTTTTTCT TACTGAAACT TTAGACACAA GGTACGGTGA TGCCTTGTTG 

165951 CTTTTGATAC TTTCCTTTTC TGTCTGCATA AGAAACCTCG CAGGTCGTAC 
166001 TAGACTCAAA GAATAAGACC TGGGCAATCC CTTCATTAGC GTAAATTTTC
166051 GCTGGCAATG GCGTAGTGTT AGAAATTTCT ATAGTCACAT GCCCTTCCCA 
166101 TTCAGGCTCA AAAGGTGTGA CATTTACGAT AATTCCACAG CGTGCATATG 
166151 TAGACTTTCC TATACACATT GTTAAGACAT TTCTAGGAAT TCGGAAATAC 
166201 TCAACGCTAC GAGCTAGAGC AAAAGAATTT GGAGGAACAA TACAGACGTC 
166251 ATCAGTAATA GAGATGAAGA TATCCTCAGT AAAGCATTTT GGATCAACAA 
166301 CAGAGTTATA GACATTGGTG AACACTTTGA ATTCTCGAGA TAGGCGGAGG 
166351 TCGTAACCAT AACTCGATAG GCCGTAACTT ATAAGTTTTT CGCCTGTCTC 
166401 CTCATTTACG TTCACTTGGC CATTAACAAA GGGATGGATC ATATCGGCAT
166451 TTAGGGCCAT CTCTCGTATC CACTTATCTT CTTTTATGCT CATTTAGAAA 

166501 CCTTAACAGT TTGAAATTGC TTTCTTAATG ATATTCTGTT TTTCAATTTA 

166551 CTGGTTTTTG GGGGGAACTT TTCTAAGTAT AAGATAGACT TTGATTATCT 

166601 CTTGAAAAGA CCAGTTGTAT AAACAAGAAA AGCCTATCCC AAAGGCTACA 

166651 ATTTTATCAC GAAACCTAGA GGTAATGTTA GATAATCCTA AGGGAAAAAG 

166701 GCAAACCTTA TTTTTAGGGA GAACTTCAGG TAGGTCTGCT CTTTACTCTT 

166751 ATAGTAGAAG AATCTTGGTT CTCTTGAATG CATTCATGCG AGGACCTTGA 

166801 TAAGAACTTC TTGGATTCAT AAAAAGATTA ACATCTCCTT ATTGATAAGC 

166851 TAGAGAATTT TTACTACCAA CTTCTCAGTG GAAAATGTTT TTAAAAATAG 

166901 TTCGCCATCT TTAATTTATC TGTTTTAAGA CAAAAGAAAT CTAGATCACC 

166951 ACAGGAAGTT TAAATCATAA AATGAAAATG ATGGAGAGGT TCTAGTGCTC 

167001 GTACTTTGGC CCTGCTCTCC TTGATAGAAA GAAGAGGTCC ATAGTGTACT 

167051 TCTATATAGT ATCTCGTGTA CTATGCCGAG TATAACCGAT CGGCGTTATC 

167101 GATGAGAGTT TCAAAAAAAT ATAAAATCCA CCTAAAGAAA AAGCGATAGA 

167151 GAAGGTTCGT ACATGACGCA TCAAGTAGCT GTCTTGCATC AGGATAAAAA 

167201 ATTTGATGTT TCGTTAAGAC CTAAAGGGTT AGAAGAATTT TATGGACAGC 

167251 ATCATTTAAA AGAACGCCTA GATCTATTTC TTTGCGCAGC ATTGCAACGA 

167301 GGAGAAGTTC CAGGACATTG CTTGTTTTTT GGACCCCCAG GCTTAGGGAA 
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1673 51 AACCTCACTT GCTCACATCG TTGCCTACAC CGTGGGGAAA GGGCTGGTCT 

167401 TGGCATCAGG GCCTCAGTTA ATCAAACCCT CGGACCTGTT AGGACTTTTA
167451 ACTAGTTTGC AAGAAGGGGA CGTGTTTTTC ATCGATGAGA TCCATCGTAT
167501 GGGGAAAGTT GCTGAGGAAT ACCTGTATTC TGCAATGGAA GATTTCAAAG
167551 TCGATATTAC TATAGATTCA GGACCCGGAG CTCGCTCGGT CCGTGTCGAT
167601 CTTGCTCCTT TCACTTTAGT GGGGGCAACG ACTCGATCAG GAATGCTAAG
167651 CGAACCTTTA AGAGCACGCT TTGCTTTTAG TGCGAGACTT TCCTATTACT
167701 CGGATCAAGA TCTAAAAGAG ATTTTAGTCC GCTCCTCACA TTTACTCGGA
167751 ATCGAAGCTG ACAGCTCCGC ATTACTAGAA ATTGCTAAGA GATCCCGAGG
167801 GACGCCACGA CTGGCAAATC ATCTTCTACG TTGGGTCAGA GATTTTGCTC
167851 AGATCCGAGA AGGAAACTGT ATCAATGGGG ACGTAGCAGA AAAAGCTTTG
167901 GCTATGCTAT TAATAGATGA TTGGGGATTG AATGAAATTG ATATCAAACT
167951 TCTCACTACA ATCATCGACT ACTACCAAGG TGGTCCCGTT GGAATTAAAA
168001 CCTTATCGGT AGCTGTGGGA GAAGATATCA AAACTCTTGA AGATGTTTAT 
168051 GAACCGTTTT TAATTTTAAA AGGTTTTATC AAAAAGACTC CCAGAGGCAG 
168101 AATGGTAACA CAACTTGCTT ACGACCATTT AAAAAGACAT GCAAAGAACT 
168151 TATTGAGTTT AGGAGAAGGA CAGTGAAACT ATTGAAAAAC GTACTTTTAG 
168201 GTCTTTTCTT CAGTATGAGT ATCTCAGGAT TCTCAGAAGT AAAGGTATCC 
168251 GATACTTTTG TGAAGCAGGA TACTGTCGTT GAACCTAAAA TTCGTGTCCT 
168301 TTTATCTAAT GAAAGCACCA CAGCTCTCAT AGAAGCCAAA GGTCCTTATC
168351 GCATTTATGG AGATAATGTC TTATTAGACA CAGCGATTCA AGGCCAGCGT
168401 TGCGTGGTCC ACGCTCTATA CGAAGGGATC CGTTGGGGAG AATTTTATCC 
168451 CGGACTCCAG TGTTTAAAGA TCGAGCCTGT AGATGACACT GCTTCTCTTT 
168501 TTTTTAACGG GATTCAGTAT CAAGGTTCCC TATACGTTCA TCGTAAAGAC 
168551 AACCATTGCA TCATGGTTTC TAACGAAGTT ACAATCGAAG ATTATCTGAA 
168601 ATCTGTACTT TCTATAAAGT ACCTTGAAGA GCTAGATAAA GAAGCTCTAT 
168651 CTGCTTGCAT CATTCTAGAA AGAACCGCTC TATACGAAAA GCTCCTTGCA 
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168701 AGAAATCCTC AAAACTTTTG GCATGTTAAA GCTGAAGAAG AAGGGTATGC 

168751 AGGATTTGGT GTGACCAAGC AGTTCTATGG TGTAGAAGAG GCTATAGACT 

168801 GGACAGCTCG TTTAGTTGTG GATAGCCCTC AAGGATTAAT TATAGATGCA 

168851 CAAGGGCTCT TGCAGTCCAA CGTAGATCGT CTTGCTATAG AAGGATTCAA 

168901 TGCACGTCAG ATTCTTGAGA AGTTCTACAA GGATGTGGAT TTTGTAGTTA 

168951 TAGAATCCTG GAATGAAGAA CTGGACGGAG AGATCAGGTA ACCTCTTTCG 

169001 CATGGCTGAT CGCAATTAGC GTGGTATGGG GCTGTAGCGA CACTGTCGGC 

169051 GTTGCTACAT TTTGAGGGAC AAACCCTTGC TGACTCTCGG CAACTATTTG 

169101 ATAAGGAAGA AAGTTGCTGG AGGCTTTAGG TAAGGTCGCA AGTTGGTCTT 

169151 GAGCTCCCAC GTGAAAAGCA ACATATACAT GCGCTTTTGG CGATTTTATT 

169201 TTAAATGCTA AGAAATTTCC AGGGCGCCAT GTCATGGGAT TTCCCATAGC 

169251 ATCTACCCAA CTGATTTCCT TATTGGAAAG AAAGCCTCGA TTAAAAAGTG 

169301 TTTTATATTT TTTTCGAAAC GCAATGAGAT CACAGAGAAA GTGCATCAGT

169351 GTAGGCTTTG CGGTAAGCTG ATCCCAAAGG AAGTAATTCG CATTCGAATC

169401 CAAAGCCCAA CGGTTGTTAT TGCCTTCCGC GGTATGGGCA TACTCATCTC 

169451 CTGATTGAAT CATCGGAATG CCTTGCGAGA CCATCAAAGT AAGGAAAAAA 

169501 TTTCGTAACT GTCTTTCACG AACTTCAAGA ATGCCAGGGT CTTCTGTTTT 

169551 CCCTTCCGTT CCGAAATTGT AGCTGTAGTT CGCATCTGTG CCGTCACGAT 

169601 TATCCTCTCC GTTAGCCTCA TTATGTTTGT GGTTATAAGT CACAGTGTCA

169651 CATAACGTAA AACCATCATG GCAACTGACA TAGTTAATCG AATTTGTAGG 

169701 CGAGCCGTGA GGATAGATGT CTTGAGATCC TGAAATTCTA GAAGCAAAGG

169751 TTCCTATGAG ATTTTGATCC CCATTAAGAA ATGCTTTCAC GTTATCACGA

169801 TACGGGCCGT TCCATTCACT CCATCTTGGA GACAGTGTGG GGAAATAGCC 

169851 CACCTGATAC AAACCGCCAG CATCCCAAGG CTCAGCTATA ATCTTTGTGC 

169901 TCGCAAGTAA AGGATCAAAA GAAATCGCCT CTAAAACAGG AGCGAATTGT 

169951 AGGGGAGATC CCGAAGGACC ACGAGAAAAG ACAGAAGCAA GATCAAATCG 

170001 GAACCCATCG ACATGCATTT CTTCTACCCA ATAACGTAAG ATGTCGAGAA 
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17 0051 TCCATTGGGT CGTGGGGGCG 

170101 GAATAATTTG TAAAGTGACC 

170151 GTCTATCCAA GGCAAAGAGC 

170201 AAACAACATC AAGAATGACC

170251 AAAGTTTTAA ACTCTCTACT

170301 ACGTCGGCAA GGAGAAAAGA

170351 ACAGATAAGG GAATTTCGAA

170401 TCAAAGATAG GTAAGAGTTC

170451 GTCGATCTTT TCAATGATTC

170501 ATGAAGAAGA TTGCGTGAAG

170551 TCTTCTTTCG GCAAATGCAG

170601 TTCCTTTAAA TAACAAAATG

170651 AACTCTGTGG GGAATGAATA

170701 TTAAAAGAGT ATTGCATTCC

170751 ATAAGACGAT TGATCAGAAA

170801 CCGTGCGGTG TGTATCGGGG

170851 TCGTCTGTTA AAGCAAGGAT

170901 AAATCGATAG CGGTTTGGGG

170951 CTGAGGGATA AGAAGAAACT

171001 CGCCAGAAAT ACAAGGGAGT 

171051 TAACAAATCA CTCTTATGGT 

171101 AGATTCCCTC ATGTCCAGGC 

171151 CTAAAGAATT AAAACTGCCG 

171201 ATTTTATTTG TTGATGGTGA 

171251 TTCTGATCGT CTTTATGTCT 

171301 ATACTCAGAG GAAATTGGCC

171351 CTTGGCGGTC AGATGGCTGG
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CGGTTTGTAT TGAGAGTGTT TCCACAGCCT 
TTGTGCATCT AAAATATAAT AGCTCGGAGT 
AGGTCGTCCC TTGCAAGCCC GTATGATTAA 
TCAATACCTT CTTGATGCAA GGTCTTTACT 
TGGAGCGCAA GGATCAGAGG CATAAGCATA 
AATTTAGGGG AGCATAACCC CAATAATTGC 
TTTCTAAAAG GATGCGCAGT CTCATCGAAC 
AACAGCGTTG ATTCCCAGCT TATGCAGATG 
CTAGGAAGGT TCCCGGAGCA TGAACCCTAG 
GAACGTACAT GCATCTCATA GATGATCATC 
AGGCTGATCA CCATCCCAAG GAAATGGTTC 
CATAATCCCC CTGTTTCTTT CGCGAACCAA 
TTCTTCGCAT AGGGATCTGC AAGATATTCT 
ATGCTTTTTA GGCCCATGAA CACGAAATGC 
TACCCTCGAT CTCTATATGC CAAATCGCAC 
TAAAGAGGGA CTTCTATGAC TTCTGAATTT 
GACTTCGGTA GCTTGTGAAG CATATAAAGC 
AAATTTTAGA AGCCCCAAGA GGTAAAGGAA 
TTTTCCATCG TCGATCAGAT ATCATAACGG 
TTTTGATTTT AAGCCTCTGA AATCGTTTAC 
AACTTGAAAC AACAACAATC TATTACAAGG 
AAAATGCTGA GGAAAATCTA AAAAATTTTG 
GATGTAGCTT TTGATCAGAA TAACACGTGC 
GTTCTCTCTT CACCTTACTT ATGAAGAAGA 
ACGCTCCTCT GTTAGATGGA CTCCCTGATA 
TTATATGAGA AATTATTAGA AGGATCTATG 
AGGTGGAGTA GGTGTTGCTA CTAAGGAACA 
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1714 01 GCTCATTCTC ATGCACTGCG TTTTGGATAT GAAATATGCT GAAACGAATT 

171451 TACTAAAAGC ATTTGCTCAA TTGTTTATTG AAACTGTTGT TAAGTGGCGT

171501 ACCGTGTGTG CTGATATTTG TGCTGGTAGA GAACCTTCTG TAGATACAAT 

171551 GCCACAAATG CCTCAAGGCG GTGGTGGAAT GCAACCTCCT CCTACAGGTA 

171601 TTCGTGCTTA ATTAAGCATA TAAGTACTAC ATCTTTTTTC TAAATTTAAA 

171651 AGATTGATAA AGCTCTTCTT AGAGAAGAAA TGACTTTATT GACTTCTCGT 

171701 TTTTGTCCTG AATCTGTTTT GAGAAGTTCT TAAGATGTAT TAATTAAAGA

171751 AGTATTAAAA ATAGAAATCT AAAGGCTATC TTATGATGTT TGGGCATTTT

171801 GCTGGTTACC TTGGAGCAGA TCCTGAAGAG CGAATGACTT CCAAAGGAAA 

171851 ACGTGTGATC ACTCTGAGAC TGGGAGTGAA GACTCGAGTT GGAATGAAAG 

171901 ATGAAACTGT TTGGTGCAAA TGCAATATTT GGCACAATCG CTATGATAAG 

171951 ATGCTTCCTT ACTTGAAGAA AGGCTCAGGA GTCATTGTTG CTGGCGATAT 

172001 CTCTGTAGAG AGTTACATGA GCAAAGATGG TTCACCGCAA TCTTCTTTAG

172051 TGATTAGTGT AGATTCTTTG AAATTCAGTC CTTTCGGTCG CAATGAAGGC 

172101 AGCCGTTCTC CATCTTTAGA AGACAATCAT CAGCAAGTGG GATATGAATC 

172151 TGTATCCGTA GGGTTTGAAG GTGAAGCACT GGACGCAGAA GCTATTAAAG

172201 ATAAAGATAT GTATGCTGGT TATGGTCAAG AACAGCAGTA TGTCTGTGAA 

172251 GATGTTCCTT TTTAATTCCT AGTCATTAAA GGAGAGTTTG TGGTTTTATT 

172301 TCATGCTCAA GCCTCTGGGC GTAATCGTGT TAAGGCAGAT GCTATAGTCC 

172351 TGCCCTTTTG GCATTTTAAG GATGCAAAAA ATGCAGCTTC TTTTGAAGCC 

172401 GAGTTTGAAC CCTCGTATCT CCCCGCTTTA GAAAACTTTC AAGGAAAAAC

172451 CGGGGAGATT GAACTCCTTT ATAGTAGTCC TAAAGCTAAG GAAAAACGCA 

172501 TTGTCCTCTT AGGCTTAGGG AAAAATGAAG AGCTCACCTC TGATGTTGTT

172551 TTCCAAACCT ATGCGACACT AACTCGTGTC TTACGTAAAG CAAAGTGTTC 

172601 CACAGTCAAT ATCATCTTAC CTACAATTTC TGAATTGCGG CTTTCTGCCG 

172651 AAGAATTCTT AGTGGGGTTG TCCTCAGGAA TTTTGTCATT AAACTATGAC 

172701 TACCCACGTT ATAATAAGGT AGATCGTAAT CTTGAAACTC CTCTTTCTAA 
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1727 51 AGTCACGGTT ATCGGTATCG TTCCCAAAAT GGCGGATGCT ATCTTTAGGA 

172801 AAGAAGCAGC CATTTTCGAA GGCGTATATC TCACTCGAGA TCTTGTGAAC
172851 AGGAATGCTG ATGAAATTAC CCCTAAGAAA TTGGCAGAGG TTGCTCTGAA 
172901 TCTGGGAAAA GAGTTCCCTA GTATTGATAC TAAGGTCTTG GGAAAAGATG 
172951 CCATCGCCAA AGAGAAAATG GGACTCCTAT TGGCTGTTTC CAAGGGTTCT 
173001 TGTGTGGATC CACACTTTAT CGTTGTCCGT TATCAAGGAC GTCCTAAGTC 
173051 TAAAGATCAC ACCGTCTTGA TAGGGAAAGG GGTCACTTTT GACTCTGGAG 
173101 GTTTAGACCT CAAGCCTGGA AAATCCATGC TTACTATGAA AGAAGACATG 
173151 GCAGGTGGGG CTACAGTCCT CGGGATTCTC TCGGCGTTAG CAGTTTTAGA
173201 GCTTCCTATA AATGTCACGG GGATCATTCC TGCTACAGAG AATGCTATCG
173251 ATGGCGCCTC CTATAAAATG GGAGATGTCT ATGTAGGAAT GTCGGGGCTT

173301 TCTGTTGAGA TTTGTAGTAC CGATGCTGAG GGACGTCTTA TCCTCGCTGA

173351 TGCGATTACA TATGCTTTAA AATATTGTAA ACCGACACGT ATTATAGATT

173401 TTGCAACTCT AACAGGAGCT ATGGTAGTCT CTCTAGGAGA AGAGGTTGCA
173451 GGTTTCTTTT CCAATAACGA TGTTTTAGCT GAAGATCTTT TAGAGGCGTC
173501 AGCCGAAACC TCCGAGCCGT TATGGAGACT TCCTCTAGTT AAGAAGTATG
173551 ATAAAACATT GCATTCTGAT ATTGCTGATA TGAAAAATCT AGGCAGTAAC
173601 CGTGCAGGGG CTATTACAGC AGCATTATTC TTGCAGAGAT TTTTGGAAGA
173651 ATCTTCGGTA GCTTGGGCAC ATCTTGATAT TGCAGGTACT GCATATCATG
173701 AAAAAGAAGA AGACCGTTAT CCAAAATATG CTTCAGGTTT TGGTGTTCGT
173751 TCTATTCTTT ATTACTTAGA AAATAGTCTT TCTAAGTAGT TGCTTTCTAT
173801 TTATTTATGT TTTAGTAATG ACTTTTATTT TAGTTTTTTT AAAATAAAAG
173851 TCATTTTTTT TATTAAAGTT TTCAATCGTC CCTGCCGATA GATCAGGTAA 
173901 GTAATTACCT GTCTAATTAG GGGAATAAAG ATGATTGGAG CGCAAAAAAA
173951 GCAAAGCGGT AAAAAGACAG CTTCAAGAGC TGTACGGAAG CCTGCTAAAA
174001 AAGTTGCGGC TAAACGTACG GTTAAAAAAG CTACTGTTCG CAAAACCGCT 
174051 GTAAAAAAAC CTGCAGTTCG TAAGACGGCT GCTAAAAAGA CAGTAGCAAA 
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174101 GAAGACTACA GCTAAGAGAA CAGTTCGTAA GACTGTTGCT AAGAAGCCTG 

174151 CAGTTAAGAA AGTTGCTGCT AAACGTGTAG TAAAAAAGAC AGTAGCAAAG 

174201 AAGACTACAG CTAAGAGAGC GGTTCGCAAG ACTGTTGCTA AGAAGCCTGT 

174251 AGCTAGAAAA ACTACAGTGG CTAAAGGTTC TCCTAAGAAA GCTGCAGCCT 

174301 GTGCTTTAGC ATGCCACAAA AACCATAAGC ATACATCTAG TTGTAAACGT 

174351 GTCTGTTCTT CAACAGCTAC GAGAAAGCAT GGCTCTAAAA GCCGTGTTCG

174401 TACAGCTCAT GGCTGGCGTC ACCAACTGAT CAAAATGATG TCTCGATAAT 

174451 TTGTGATTTT CGCATTATTG CTCATGTTAA CGGGAAAGGG AAACATTGGG 

174501 TTTCCTTCCC GTTTTTCTTT TTAAGGTTAA AAAGCTTTAT AGAGCGAGAT

174551 CTTCAGGCTT CATGCTGTAC AGTTGGTAGG AAAATACGTA TAGTAGGTTC

174601 AGGATACTAC TTTTTTGACT CTACCTATGC AAAAATCCTT AACGAGTTTT

174651 GATGACTTTT CCCAGGCGTA TGCAGAGAAA GTGCCCGCTA TAGCTCTTAT

174701 AGGGAGTGCT TTGGAAGACG ATAAAGATGC GCTGATTGAA TTATTAGTCT

174751 CTGAGAGCTT CAAAGAGCTC GGTGGTCAGG GACTCATGCC AGCAACCCTC

174801 ATGTCTTGGA CCGAGACGTT TGCACTCTTT CAAGAGCATG AAACTTTGGG

174851 GATTATTCAT GCAGAGAAAT TCCCTCTAGC AACTAAGGAA TTTCTAAGCC

174901 GCTATGCTCG GAATCCTCAA CCTCACCTTA CGATTTTGAT CTTCACCACA

174951 AAACAAGAAT GCTTTCGAGA ACTGTCAAAA GCCTTGCCAT CGGCTCTTTC 

175001 TTTGAGTTTA TTTGGTGAGT GGCCCGCAGA TCGTCAGAAA AGGATCATAC 

175051 GCCTCCTGTT GCAAAGAGCT GAGCGTGTGG GGATTTCTTG CTCTCAATCA

175101 TTGGCATCTT TGTTTTTGCG TGCACTTGCT TCAACCTCTC TTCCTGATAT

175151 TCTCAGTGAA TTCGATAAGC TACTGTGCTC TGTTGGCAAG AAAACGTCCT 

175201 TGGATCACTC TGATATTAAA GAGCTCGTTG TCAAAAAAGA AAAGGCTTCC

175251 CTATGGAAAT TTCGAGACTC TCTATTGAAG AGGGATCCGG TAGAAGGTCA 

175301 CCAGCAGTTG CATTTTCTAC TCGAGGATGG TGAAGATCCC TTGGGGATTA 

175351 TTACTTTCCT TCGTACCCAA TGTCTCTATG GTTTACGTAG TATTGAAGAG

175401 GGATCGAAAG AAAATAAACA CCGAATGTTC GTCCTTTATG GAAAGGAGAG
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17 6801 AAGTTGGCAA ACGCATCAAG 

176851 AAGTACGGGG AGAGGCCGTT

176901 TCTCCTTGGC CCATCAGTCC

176951 TTTATGTCAG GCCCTTTTGA

177001 AAAATCAGAA TCGTGAGTTT

177051 TATATACAAC AGGCGTTAAC

177101 TGTAGGGATT TCCCCGACGG 

177151 ATTACCGTGT TTCTGGAGAC 

177201 AGGGCGGAGA ACACCACGAT 

177251 TTTTCTGCCT TCTATCCGTT 

177301 GAGAGCAATG CTTGGTTGCG

177351 GGTCTTTCTC AAGATGCCGA

177401 TCTAGCTGTT GTCATTGATT
177451 TAGAATGGTT GAATCAAGGA
177501 TATCCCCAAC GGTGTCCTGA
177551 TAGAGTTTCA GGACTTGCGC
177601 GCTTAGACTT GCAGATCTGT
177651 GCGGTCCGGT ATTTTTTCTT
177701 GAGATTCCGT ACTGCACGAA
177751 TCCTTGTTCT TGTTTTTGAT
177801 CCTCAGGAAA TCTTACGAGC
177851 GAGTATTGTC TTTGTAGAGT
177901 ATCGTGTACA ACAATTTGTG
177951 GTAGGAACAG TACGAGCTTC
178001 CTTTTTACAA ACTGTACATG 
178051 TGCTTAACCA GATTGCAATA 
178101 AACACTGCTG TTTATGATCT 
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AAGCTGTACA GGAATTGCTG TATATTAGTG 
CGTCTTCTCT ATAACGATGG AAGCGGCATG 
TTGTCGTACT CTTCCTACTC TCGATCATCC 
CGGTTTGGGA ACAGTTTTTC TCTGCTCCTG 
CTAGTGATTT TCTATGGGGA TGCATCGCCT 
GCAATCTAGG CATAGTCCAC GTATTGTTGT 
TCTTTATTCA AGGAGACTTT AGGGTCCATA 
TTCTTTAGTT CTCTGGATTG TCGGGGAACT 
ACTGCCGTAT TCTTCGGGTC TTGAGGGTGT 
GTCCTTCTTT TACTTGGGCG GTGCGTTTTG 
AATAGGGGTG AGGATGTAGA AGATAGGGGA 
AAGATCACAG TTACCACACA GTGAAAGAGA 
CTACGGATCC TAGTTCTATG AGTAGGCTTG 
TCGCCTTCAT CAGATATGGA AATCAATCCC 
TGTAGCTCTT TCTGCGCTTT ATGCAATTTC 
AGGAATGGAT CCTAGCCTCT GTTCATGAGG 
TACTCTTTAA TTTTGATGCA CACGACGTTT 
ACTCTTTACA AATTATCCTC AGTCTAGAGA 
TCGTAGCACA ATCTCTATAT TTACCAAGCA 
TGTGGCAACG TCCTGCGTAA ACTATGGATG 
AATTTTTATT TCTGCGTCTA CAATTTCAGG 
GCACTCGCTG GATGGGGCGA GGTCTTAGAC 
CAGCAACGAG TTATAGGAAG TGGCCTGCCT 
TTATCGCGAT CGTGCAGGCT TTATCATAGG 
GAGGACTTTA TTTGCCGGTA TCCATTATGG 
CAAGTTCCAC GTATCTTAGT ACGTCCAAAT 
ACATAATAAA AGTGCTGAAG AAAATTGGAG 
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178151 CAGTGGTGAT GTATTAGCTG TTGGCCAAAC ATTAAACTTC ATCTTATGTG 

178201 CTTTCGTCTT GTTCGTAAAT CTATGGTTCT TTGTGAAGTC CGTATTACGC 

178251 CATTCTAGGC GTCGTCGTCG CTAATGTATC TTTGGAGACA TCTTGGTATT 

178301 TTTAACATAA GAGCACGGCA AAAAAAAGAA AGACCGAGAA AGAAGCTTTA
178351 CCTTTAAGAG TTCTCTGATG TTTCCCGGAG CATCAGGAGT CCGAAGGGCA
178401 GAGTCTAGGT TTTAGCCTTC CGCAGGGATT AGAAGGAATA TCCCATAATC

178451 TCTTCCGCTA TTGTATCGTG GAAGAGACGG CCCTGTCTAT TTAGGGCAAG
178501 ACATTGTCCA TGCACACTGA ATAGGTTTTG TAATTTTACA TCTTGCGTAA
178551 GCATGGAGAT AAGTGTGGAG GGGAACTCCG CGAGGTCTGC TCCTTCAAGG
178601 AGTCGGAGTC GCAGGGCTAA GGCTTCTTTG ATTCGTTCTT TTTTTGGGAG 
178651 AATTTCTGAG GTCTCTTGGG TAGGGAGATT CTTACGTACA GCACGTAGAT 
178701 AGTGAGAAAT ATGACTATAA TTTTTTGACC GCTCTCCGTG AAGGTATTGC 
178751 GAAGCTGAAA CTCCTAAGCC TAAGAAAGGG CGATCTGTCC AGTAATAGAG
178801 GTTGTGCTTT GCGGGGTAAT CTGGCTTGGC ATATGAAGCA AGTTCATAGC
178851 GTTGGAACCC TTGGGAGAGT AGGAGATTTT CAGCAAGGAG GCTCATCTCA
178901 GCTAGAATTT CTTCTTGGGC AATTGTGGGG ACTAGAATTT TGCGGTGTTT
178951 ATAGAAGGAG GTGTGGGGAT CTATAGTGAG GTTGTATAGA GAAATGTGAG
179001 TGATAGGGAG AGTCAGAGCT TGATGTAGGT CGCTTAGGAA TATCTCCAAA
179051 GACTGTGTGG GCAGTCCGTA GATTAGGTCT ATAGAAAGAT TAGAGAATCC
179101 GTGATTCTGG CATTCTTGCA GTGCTGTGAT TGCCGCAGAT GAAGAATGCG 
179151 TTCTTCCGAG GAGCTGTAGG ATAGAGTCGT CGAAGGTTTG TACGCCAACG 
179201 CTAATTCTAT TTATTGGAGT CTCTTGTAGT TGACGTAGAT AGCTTACGGT 
179251 GAGATTTTCG GGGTTGGCCT CTAAAGTAAT TTCCCGGGCA TGGGGGGCTA 
179301 GCTCTTTGAG GATGCGCTTA AGATCAAGAG GAGAAACTAA TGAAGGTGTT 
179351 CCCCCTCCAA AAAACACAGT CTCTATGAAA TGCGTCTCTT GGATGGGGGC 
179401 TAGCTTTCTT AGCCCCTCTT GAATTACAGC ATTACAATAG AGCGATACAG
179451 ATTCACTTTT GTAGGGGATT GTATAAAAAC TGCAATAGCG ACATTTTTTT
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179501 GTGCAGAAGG GAATATGAAT ATAAAGAGCT 

179551 TAGCATCGGG ATCAATGATA CCACCGCGTC 

179601 AAAGGGTCTT CATCGTCTTC GTGGGAATAG 

179651 TTCACGTCCT TCATCGGTGA GTTCCAGTTC 

179701 TATCAATTTC TGTTTCTGTG GGTTCCACAA 

179751 GGACCATCAG GGTAGACAGT CTTTGCGGGA 

179801 CTCTTCAATC TCTCCGTTTT CTTTGAGGAG 

179851 CGTAAATTTT ATGTTCTAAG ATTCGGATTT 

179901 TCTTTTTTTG GCACTAAACC CAACTGCATC

179951 TTCTGATTCT AATTTTTTAA GACGTTCGCT

180001 CCTGTATTCG AGAGCAGGTG CATAGCATTT 

180051 AAAAAAAACA AAGATTTTTT TTTTTTTGAA 

180101 GCAGACATTT AGATCGTATT TATTGAGTTT 

180151 GTTTGTGGGG CAAGTATATT CTTCGGATAT 

180201 ATCAGAGATT TATGAATCAC GAGACTTTGG 

180251 TTTGAAGGGT ATCAGCTCGG TCAAGCAGCA 

180301 TAAGATTTCT GGGAATGAAA CTATTGCTAT 

180351 AGTTTCTATG TACGATTTAT CGTTATTATG

180401 TCAACGCTTG CCCCAACTAC AGATTCTCGA 

180451 TAAGATTGAT CTGGATGAGC AGGTGCCTTC 

180501 CTCAGGTTTC GGTACGAGAG CTGATCGAAG 

180551 GGAAGTCTTA CTTTAGAAAC CCTAACATGT 

180601 TGTTTGGAAT CTTATGGAGA AGCGACAAGT

180651 TCCTTCGCTC CTATAAAGAC TTATGTAAAG 

180701 TTACAGATAA AATTTACAGG TCAGAAACGT 

180751 GACCTTGGTC CCCATGTTGG AGCATCTTGT 

180801 GAATTTCTAA CTACGTTTTA GGAATGGCCC 
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AGGGGAGCCT TACCATTCAT 
TCCAGCGATT GCGATCGCTG 
TCGACTTCTA CTGCCCCATC 
TACGGTTTCG CCTGGGTCGA 
ATTCAATATC AGACATACTA 
CTGCGTCGTG GTGTGACGTA 
TTGTAGACGT TCTTTTTCTT 
CTTCTTGGTG CCTGCTAATT 
CATTGCGTAA GGTCATGAAG 
TTTCATCCAA GGGTTCCCAT 
TTTTGTTTTT TCAACTGCAG 
AAATAAAAAT TTGCTATGAA 
AATTATTTTA TGGATTCCGA 
GGATTGGATC GAGTCTATGT 
ATCCTTCTTG GAAGTATTTT 
TCTCCATCAG AAGCTAGTAC 
GCTTCAAGAA CAAAAATCTC 
GATATTTGCA AAGTCAAATT 
TTCATTCAGG AAAAGATCGC 
TGCGGGTCTA CTTCCTAAAG 
CTTTAAAAAA ATGCTATTGC 
ACTCCTGAGT TGCAGGAGTT 
GGAGCGCTTT GCAGAGCAGC 
CAACGTTTTT TGAAGAGTTC 
TTTTCTTTAG AGGGCGGAGA 
TCATTATGGA TCGGCATTAG 
ATCGAGGTCG TTTGAATGTA 
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180851 TTAACGAATG 

180901 AGACGATCCT 

180951 ATAAAGGGTA 

181001 GTGATGTTGC 

181051 GGGGGTCGTG 

181101 GCAGCTTAGC 

181151 GTGGTTTATG 

181201 GGGTACGCTT 

181251 CACGGGAGTC 

181301 GGGATTCCTG 

181351 AGCTATAGAG 

181401 TCATAGATCT

181451 CCCTCAGTAA 

181501 TATTCGCGAG 

181551 TTTCTGAAGA 

181601 AATCGTGAGT 

181651 AAAAGAATGT 

181701 ATGATTGTGA

181751 CGTCTTTGTG

181801 TCTTTTAGAA 

181851 GGGCGATGGC 

181901 AACCTGAGAC 

181951 ACATTTGGTA 

182001 ACCATCTTTC 

182051 TCCGAATATG 

182101 AAAGACTTTA 

182151 CACAAATCAT 



TTTTGGGAAA GCCTTACCGT 
GCAGCACGTG GTTTAGAGAG 
TGTGCTAAAG TCCCATCAGA 
CAAACGCTAG TCATCTCGAA 
GCTGCCTTGC AACACCAAGG 
AATTTTAGTT CATGGAGATG 
AAACTCTCCA GCTGAGTCGT 
CACATTGTTG TGAATAATTA 
AAGGTCCACC CCTTATTGTA 
TATTTCGAGT GAATAGCGAG
TACGCTCTGC AAGTTCGTGA 
CTGCTGTTAT CGCAAGTATG 
CAGCTCCCTT ACTCTATGAT 
CTGTTTAGGC AATATCTGTT 
AACTTTGGCA TCTATTGAAA 
TTCAAGTATT GAAAGGGACG 
CATCACTGCG ATCGCTTAAA 
TGTTTCTTTG GATCGCGAGA 
GTTTCCCTGA CAATTTTCAT 
AAAAGAATGA AAATGGCAGA 
CGAAGAATTA GCCTTTGCTT 
TCTCAGGTCA AGATTCTATT 
TGGAGTGATA CTGTGACTGG 
TGCAGAGCAG GGCTCTGTAG 
CAATTTTAGG GTTTGAGTAT 
GTGTTATGGG AAGCGCAGTT 
TTTCGATCAG TATATCTCTT 


TATGTCTTTA TGGAGTTTGA 
TGTTGGGGAT GTAAAGTACC 
AAGATAGGGA AACTACCTTT 
TCTGTAGATC CTATTGTCGA 
TCACGCAGGT AAAGAGCAAA 
CAGCATTTTC TGGTCAGGGA 
GTTCCAGGGT ATTCTACTGA 
CATAGGGTTT ACCGCAGTGC 
CGGATATTGC TAAAATGCTA 
GACGTCGTTG CCTGTATAGA 
GAGATTTAGT TGTGATGTGA 
GACATAATGA AAGTGACGAT 
CAGATTAAGA GAAAGAAGAG 
GGAAGGGCAG TTTGCAGATA 
AAGAGATTCA AGAGAGTCTG 
GATCCAGAAC CCTTTCCTAA 
TAACGGCGAG CTTATTTTGC 
CTCTTTTTCA TATGAGCTCG 
CCCCATCCTA AAATTAAGAC 
AGGTGGGGTT GGTTATGATT 
CGCTATTAAT CGAAGGGTAC 
CGCGGGACAT TCAGCCAACG 
AGATACCTAC TCTCCATTGT 
AAATGTATAA TTCTCCTCTT 
GGCTATGCTC AACAGGCATT 
TGGGGATTTT GCTAATGGTG 
CGGGAATTCA GAAGTGGGAT 
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182201 TTACACTCTG ACATTGTTCT 

182251 ACCCGAGCAT TCTTCATCTC 

182301 ACTGGAATTT TCAAGTGGTC 

182351 ATTCTCAGAG AGCATGCTAA
182401 TACTCCTAAG TTGCTGCTGA 

182451 AGTTCACAGA ACCTGGGGGA
182501 AATTATGATG CTTCTATTTT 

182551 TTATGCAGAA ATGCTTCCTC
182601 GTATAGAGAG CTTGTATCCT 
182651 GATAAGTATT CTCATTTGAA 
182701 GAATATGGGG GCCTATGACT
182751 CTGAGAAACT GCTATATATA
182801 GGATCAGCGA AGCTCAGTCG 
182851 CTTTTCTTTA AGGTAAATTA 
182901 CAGAGTCGAT TAGCGAGGTG 
182951 GCTCTGATTC AAGAAAACCA 

183001 AAATCAGCTC ATTTATGCCC
183051 CAGAAGGCGA TGTTGTTCCT 
183101 GCAGGTGAAG GGGAAGAGCT 
183151 AGCTGAGATC ATTTGCTTTC 
183201 AGAATAAAAC GTTTATTCCT 
183251 GGTCTTTCTG CAGGAGATCG 
183301 TCGTAAGACA ATTTCGCGGC 
183351 TGCTCACGAC ATTCAATGAG 
183401 AAGGAAAAAC AAGAAGAGTT
183451 TATGTCTTTC TTTGTGAAAG
183501 GAGTGAACGC CTATATTGAT
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GCTTCTTCCC CATGGGTATG AGGGCCAAGG 
GTATAGAACG TTATTTGCAA TTAGCCGCGA 
TTGCCTTCCA CTCCTGTGCA ATATTTTCGG 
GAGAGATCTT TCTTTGCCTT TGGTGATCTT 
GATATCCACA ATGTGTAAGT AGTATCGAGG 
TTCCGTGCTA TTCTCGAAGA TGCCGATCCT 
GGTATTGTGT TCGGGAAAGA TCTATTATGA 
AAGATCGGCG TAAGGACTTT TCTTGCTTGC 
TTAGCTCTTG AGGATTTAGT GAGCCTTATC 
ACATTTTGTT TGGCTACAAG AAGAATCCAA 
ATATGTTTAT GGCGTTGCAA GACATTCTTC 
GGACGTCCTC GGAGTAGTTC CACAGCTTCT 
TCAAGAGCTG GTCACGTGTA TGGAAACCCT 
TGACTACAGA AGTACGCATT CCTAATATTG 
ACCGTAGCTT CCTTGTTAGT TACAGAGGGT 
GGGCTTACTA GAAATTGAAA GTGATAAGGT 
CAGTATCGGG AAGAATTTTC TGGGAGGTTT 
GTAGGGGGGG TAGTGGGAAA AATAGAGCCC 
TGGAGATTCT CAGTCTAAAG AGACTATAGA 
CTCAGTCTGG GGTGCGTCAG TCTCCTCCAG 
CTTCGTGATC AGATGGACCA AGGATCCCAA 
AGGAGAAACT CGAGAACGCA TGACCTCGAT 
GTCTTTTGTC TGCTTTACAT GAGTCTGCGA 
GTCTATATGA CACCTCTTTT TCATTTGCGA 
TCTATCTCGA TATGGGGTGA AGTTAGGATT 
CTGTCTTAGA GGCTTTGAAG GCATATCCAC 
GGCGAGGAGA TTGTTTACCG TCACTATTAT 
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183 551 GACATTTCTA TTGCTGTAGG TATCGATCGA GGACTTGTGG TTCCTGTGAT 

183601 ACGCGATTGC GATAAACTTT CTAACGGGGA GATTGAGCAG AAACTCGCAG

183651 ATCTTGCCCT TCGGGCTCGT GAAGGCCTAC TTGCAATAGC GGAGCTTGAG

183701 GGAGGAGGTT TCACAATTAC CAATGGAGGC GTATATGGAT CGCTACTTTC

183751 GACTCCCATT ATCAATCCCC CGCAAGTGGG GATTTTGGGG ATGCATAAGA 

183801 TAGAAAAGCG CCCCGTTGTT CTTGATAATG AAATTGTAAT TGCAGATATG 

183851 ATGTATGTCG CTTTAAGCTA TGATCATCGT CTTATTGATG GGAAAGAGGC

183901 TGTTGGGTTT TTAGTCAAAG TGAAAGAAGG CCTAGAGAAT CCTGCCTCAT

183951 TACTCGACTT GTAATTTCTC TGATTCTCAT AAAGGCTCTT TTAGAGCCTT
184001 TTCAGATTTT TTAACCTCTT TTCTTATCAT GAAAAGGATT GCACTCATCT 

184051 TGAGTAAGGA ACAATGTAAA TTGAGTATCC GCACTTACAT TGGCTTGTAC
184101 CGCAGCCTCA TTCAAGGCGC CTAACAACTT TCCGGGTAAG TACACATTGT 
184151 CTCTTTTAAA GTGAGGTTTC CCGGTATGAT CAAGAGGAGT GAAAGCTTCT 
184201 TCTGCGTGAA TATGTACTGA TGTGAAGCCA TTTTCTAAGA GATCTACAAA 
184251 CAGATTTTTC AAGAACTCCT TTGAGTCTTT TTCAGAAAGC GAAGAGCTCA 
184301 TCGGATCTTT TCTTTCTAAG ATATCTTGTA CATCCTTGTC GTAATAACCG

184351 CAAGGAGTCC ATGCTAAAAC TAAAGTTGAA CTGACACCAG GAGTTGGAGA

184401 ATTGAAAATT GCCCATTGAC CTGCTTTTTT TCTAGAGTTC ATTTTAGTTT
184451 TCATAGCTTC CCACTGTTCC TCTCCTAAGA GATCTTGCAG TTGCTTTAGG
184501 GGAGCCGAGA CTTCCTTATC CCCTTTCCAA TGCGAAGTTT CTGGTGAAAG
184551 GAGGAATAGC GAAGGTACGC GAGCCCCGTG ATCTCTAATT TCTGTGAGAT 
184601 TAGGAGCTTT AGCTGTGACA ACCTTAAATT AAGGACCCTT CTTTAGATGG 
184651 TGAATTTCAC AATTCGTAGC TTCTGTAATT TTGATTTCTT TAGTGAGTTT
184701 TTCTTTGGGA GGAAGAGCGT CGTAAGGTAA GGAACAATGA TGTTAATAGG
184751 GCTAGTTTAG TTAAATAAGA ATAGGGGATC TCAGGTTTAG GAGTGATGGG
184801 TTGTTGCTCG ATTACAATGA TGGTTAATGG AAGAAGAGCG TTTCTAGAAA 
184851 GTGCATCTGC TTCCTCGTCT TCTTTAGGCA ACAAGCTCAA GATGAGAAGC 
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184901 
184951 
185001 
185051 
185101 
185151 
185201 
185251 
185301 
185351 
185401 
185451 
185501 
185551 
185601 
185651 
185701 
185751 
185801 
185851 
185901 
185951 
186001 
186051 
186101 
186151 
186201 



ACGAGCCCGA 
CGCCTGTATT 
CAGCTGCTAA 
ATTTGAAGAG 
TTTGTTAGCT 
CGTGAACTTT 
GCTTAATTTT 
AGATTACTGA 
TAAGGTCCTT 
CTGGGTAGGA 
GCCTGTGAGA 
AAGCGTCGTG 
TTAGAGATTC 
AAAATGGTTT 
TAGCGTAATC 
GCTACGAAGA 
TACGGTGACA 
CATTTGCGAT 
TCGGAATGGC 
GTCTGCAGGT 
TGCCGTGGCG 
GCAATCTCAG 
TTTTGAAAAG 
TCCAATAGTT 
TAGTTTCTTC 
AATTCTTCTT 
TTTTACATAA 



TGGCTAATCC 
AGAGGGAAGC 
CATTAGTGCA 
CAAGACTTTT 
TCAATCTCAG 
ATGATTAGTT 
TTCCTAAAAA 
ATTCTGAGTG 
AGGGGAAAGG 
GAAGGACGGC 
TTGGGATCGG 
TTCCGTGTAG 
CTAAATTCGT 
TTGGGATTCA 
GGGATAGATA 
GATCTTGTGG 
GCATAGATAT 
TGCGTGGTGT 
GGATATGGAG 
TGGTAGGTGG 
TTGATGAAGG 
GATTCTTGGC 
ACTCCGTGGC 
CAGAGGGTGA 
AGGGTCTTTC 
CAGCATCTTC 
AGATCGATCA 



CACGAGTGTG 
CTAAAGCTAA 
GCGAAAAGTA 
ACTTCTGAGT 
CGAGTTGCGC 
GATGCGCCTA 
TTGTAGTTTA 
GGTTGGAACG 
AAAAGAAGCT 
GGTTACATTA 
GATGGTGAGC 
GTACAGAGGT 
AAGTTGCTTG 
TAAAGGGAAG 
GCATAATCTG 
TTTTGTATGA 
TGCCAAGCAA 
TCTCGATCAT 
AGAGAGGAGC 
GGGATGTAGG 
TCGCAATACT 
TGCGAAGACC 
GTATAGGCAA 
GTGTGGAAGG 
CATACCCCAT 
CATGGGTATG 
TCCCTGTTTT 
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ATAATACCAG 
GAAGGTAACG 
TGTAAAGAAC 
GTGCCTTCTG 
TTTTTTTGGC 
ATTTCATAGA 
CATGTTGTTT 
AAGCATGTTC 
TGGCTGGATC 
TTTCTATTTT 
AAGGTACCTT 
CTGAGATAAA 
CGAGCAATCG 
AAAGCTACGA 
GACCGATGGA 
AATAATTTTT 
TCCTCGCCAT 
AAAAGATAGC 
GGAGACTGCG 
TGTAACACAA 
TCGGAGATTG 
GTGCCCTCGG 
ACCATCGAAT 
ATAGAGTCAT 
GTTCTTGTAA 
TGAGCTTTTA 
GGAACCTACA 



CAATGCTATA 
AACGTGATTC 
AGCACGCGCA 
TAGCTTGGGT 
TTCACTGGTT 
AAGTCTTTAA 
TTCATCAAGA 
TTCTAGAAGC 
TTTTTAATCT 
TTGAATGTTG 
GAAGAAAAGA 
AATGCGGTCT 
CACGCAGGTC 
GGAAATAACG 
AGGGCCGATA 
TCATAGTACC 
CCGCTGTGTA 
TGCTTGGCAA 
TGCACAGTCC 
CGTACGGAAG 
GAGAGCTGAA 
CATCCTTTTG 
TCTTCGAAAG 
ACTGTCAACT 
AAGTCGAATT 
CACAAGTATG 
AATCCAAAAT 
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186251 CTGCATCTGC CATTTCTCCA 

186301 ATCTTAAGTC CTGGTAGGTG 

186351 TTCTTCAAGA TCAAAGAGGG 

186401 CTGTTTTTAC AAGGCGCACC

186451 TTTAGCACGT CCTGTAGGGG

186501 TCCAAGGCCA TCAAGAAGCA

186551 CAGCTTCTTC TTTATTGTCA 

186601 TTTCCTTGGT GTCCTTGTTT 

186651 GGGGTCTGAA GCATGAAAAT 

186701 TATCCCAAAT CTCCTCATTG 

186751 TGGAAAACTA AAAAGTGTTC 

186801 TAACTCAGGG GGAACGACGA

186851 TTGTTACGGG ATTTACCCCC 

186901 TCGGTAAGAT GGTGAGGATA 

186951 CCAAAGTGTT GTTTTCGCAG 

187001 AGTGTTGTAG GGAAAAGGGA

187051 TTCGTATGGC GTAGCAAGCT

187101 CCCTGTGAGA GAGCAGCGTA 

187151 TTCCGATTCC TACTGCGGAT 

187201 TCAGTAACTC CAAGGTGAAG 

187251 TTTAGCAAGT TGGCGGTATG 

187301 TCATTGAGAA GACAACATCT
187351 TATTCAATTG CTGAGGCTAC 

187401 CATGATTCTT TCGGAAAGTG
187451 TGCCTAGTCG CTTACATTTC
187501 CGCAGGAGAC TTTGGGCATA 
187551 GAACATGTTC CTCTTATCTA
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GGGCCATTCA CAATACAACC CATGATAGCG 
CTGCGTTCTC TTGCGGATAC GTGTGGTGAC 
TCCGACCACA CATAGGACAG GAGATGTACT 
CCTGCATTTT GTAGAGTGCC AAAGGCAATT 
AAGGTTCGGT AAGTCAAGAA CCACAGCTTC 
GAGCTCCAAA CTCTGTTGCT ATGGAAATAG 
AAGTCCCTTG AAAATACTAG CTTGGTCGGT 
TTCAAAGAAA TCTCGGGAGG TATGAATGAA 
GCACAAATGG AGCTTGATGA ACAGCAGGGC 
TGTTCATATA GGCAAGGCAC TTGATGGTGG 
TCGAAGTACA TCTGTAATAG GAGCATCTTT 
CCCCTTCAGG AGTTGTGAAT GCTTTTTCTT 
AAGTGTTCTA AGAGTTCTTC AGGAGTAAAG 
GAGTTTTAAA AAGACTCCGT AGACGTCTCC 
GCTTCTCTGC AGCAGAAACA AAGTTTTCGG 
TTTTTCTTTT CTGGAAGGTC TAAGTAGATT 
ATCACAGACA GGAATTTCTG TAGTGGGACA 
TGGTATCCCC GAGTCCTTCG GCAAGAAGAG 
TTTATGATCC CGTCCACGCC CATTCCAGCT 
GGGATAGAGC CAGCCTCTAG CATCTAAGTC 
CAGTTACCAT GATCTTCGGA TTGCTAGATT 
CTATAATTCA GCTTTTCACA TACAGCGATA 
CATTCCTTCG ATAGTGTCGC CATATTTTTG 
ACCCGTGGTT CACTCCAATG CGCATAGCCT 
TCTACTAAAG GAGCAAACTT TTCTTCAAGA 
GCTTGCCTCT GTATAGATCT TCGTCCCCTT 
TGTAGTTGCC TGGATTGATG CGAACCTTGT 
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187 601 CAGCAAAATC AGCAACTAAC ATAGCTGCTT GAGGGAAGAA GTGGATATCT 
187651 GCAACCAAAG GGATATTTAA CCCTAGAGCA ATCAGACGTT CTTTAATTTT
187701 TTCACAGGCT TGTGCTTCCT TGATTCCCTG TACAGTCACT CTGACAATAT 
187751 CACAATTATG TTCCGCTAGA GCGTAGATTT GCTCTACTGT ACTGTCAATG 
187801 TCTGTGGTTA ATGTCGTTGT CATTGATTGG GTTTTTATTG AGTGGTCACT 
187851 GCCTATGTAT AAGTTGCCTA TTCTTACTGT ATGGGTTTTG CGTCGCGAGG 
187901 AATTGATGGC AGGGGTAATG AGTGTCATAA AAATCTCAAA AATTTCAGAG 
187951 TTTTATCAAT TATAAAAGCC GAGAGAATTT TGTTGAAGCT GATAAAGCTT
188001 CTCTAAGGAT GCCTACATTT ATCAATTTTA CAGAACTTCA CCTCAATAAG 
188051 AGGAATTTTC AGAGAGAAAC GACGACCTTG GATGGCCGAG TGTTTCAGGC 
188101 CATCAAGAGA GTATATGGAG AGATTATTGT GAATTAAAAT AATTATTTCA 
188151 TACTTATCAA GGGGATAAAG ATTTTGCTTT AGAGGACGAT TCGTAAGGAA 
188201 CTCGAATCTG TTTTTGTGTT CGTAGAACGA ATCCCGAAGT TTGACAAAGA 
188251 GAAGGAGTTC CCAAGAGGAG TGCTTGGGGC TCGTACTTTG GGAACGCTTT 
188301 TTTCTAATTT GAAATCGGTG TATTTTGTGA TGGATGCGTG ATTTTATACA 
188351 TCGAGATTAA CATGATAGCC AGAAGGATTA GGGAGAGCAA GAATAAGGGC 
188401 CCTATAACTG TAAAGTAGGC TCCGACAGAT GTTGCTACAA AGATCATGGC 
188451 TATAGCAACG ATTCCAGAAA CAAAGGTATT GAGGAGGAGT AAAGCAGTTG 
188501 AAGCCGCAAA GGCAATTTTT ATGCATCGTC CTGATCCCGA TAACCTTTCT 
188551 AGGAATTCTC CAAGCTTTGT TTGTTCAACT GGAGATGCAC TTGATGTTCC 
188601 TGTGACTACT GGAGAAGACA TGTGATTTCC TATATTTCTA TGACAAGCCC 
188651 TATAATTTTA TAGTTTTCTA TTTTAAATGC AATGTAATTA GTGTTTTACA 
188701 AATTAATCTC CTACTAAAAT AGGAGGAGAC ACTCTCATAT TTTCTATGAC
188751 CTAGGGAATA AAATATTTTA TTTATAATGA GTGAAAAACA CCGCTGTTTG 
188801 TCTAAGAAAT ACTTTTAGAA GAGAGTTTAC GTTTTATCTG CTGTTCAGCA 
188851 GTTGTTCCTT CGGTTTTGTT TACAAGGGCA AGCTTTATAA CAAGAAGAAG 
188901 CACACAGAGT ACTGTTGTAA GAATAAGGGC TCCTAGGATA ATCAAGCAGA 
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188951 TGGGAGGAGT CCCAGGAATA AAGAAAGCTG CAATTATTCC TAAAACTTCT 

189001 ATTACTAGAA TGGCAACAAG GATTTTAACT AAAAGTTTAA TCGTCTTCGT 

189051 CGTTTTATCT TGATGCTTTG CTTTTAGAGC TTTGGAATTT GGTTCCATGA 

189101 GGCCTAGATT TCCAGAGGAA GGGCGGTGGC TTGTGGGTAG GGGGGCTGAG 

189151 GACACGGGCA TCGTCTTTTA GTTAGAATTT TCTTTATTTC GCAAAAGCTT 

189201 TGTCTTATGG AGAGCAGAGG CGTTTCTATT TGCTTGAGAA AGTATAGCGT 

189251 GGTTTAGAAA GTTAGAAAAG ATTTTTTAAA AAATCTTCTA TGTTTTGGAT 

189301 GTTTTAAAGA GAGAAGTCTT ACCATTTTTT AAGAAAGGAT TTCCTTTTCA 

189351 CAAAAAGGCA TCTACTCTCT TTTGCTTAAC CAGGGGGAGA GATTAGGGTG 

189401 TGATGAGGGG ATGTCTAGCA AGAGTTGTCA AAATGATAAC CCACGGTTGA

189451 TTTTGATTTT CTGCTTCTGA TCCAAATGTC TGCAGAGCAT CTATCAAGGC 

189501 TAACTTCACA CCATCAATCC ATTGCAGACG TAGACTAGTA GTCTCTTCGT 

189551 CTTTTGGAGA GCCAGGGAAA TGGGAGGAAA GCAAGGGGAG CTGAACTACG 

189601 GTTGTTTTAC GGCGCTTGGC CTCATTCAAA CAGTTAAGGT AGGCTTGCCT 

189651 ACAAAAGGTA AACGCATCAT TAGGATTGTA GTTATAGTCA GAAGCTTTCG 

189701 GCCCTAGAAG TTGTCCTAAG AATTGCGGTA GTCCTGCTTT ACCTGCATTC 

189751 GTGGTTCCGT TTAGATTTTC CCATTTTGCT GAGCGGCATT CTCCTAATTG

189801 TAGGGGCTGT TTTGAACGGA GAGGATCCGG AGATTTTTTC GAATTTTCCC 

189851 AACAGCGTAC ACTAGTTGCT TTCGCAAGTG CGAGACTTGT TCCTTTTACG 

189901 TTGTTTGCCA TGTTTGGGGT GGCTGCATTC ACAATCATGA CGATGCCTTG 

189951 AGTTTTATAG CGAGGCTTTT CAATGGGCCC TAACGTAGAG ATCAAAGTCA 

190001 GGTTTGAGTT TTTCAAATGC CAAGCATAAC CGGTTTGATT GTCAGCAGCG 

190051 ATGAAAGGCA TATCTGTGTT AGCCTGTAGA TCAGGTATAC GGTCCCAATT 

190101 TTCTTGTAAC AGTGTTTGAT GGGTTCTAGG GGAAGGAAGC GGGGGCTCTT 

190151 CTAGGACAGC TTCCGTAGGC ACTGGGGTGA GTGCTAGGGG CGGGACAGGA 

190201 GGAAGGTCTG TATCTTTTAT GATGGGCGTC GGCTGATGCG TTACGCTTAT 

190251 AGGACTTTTA GGTTCTCTTA AGAGAAAATA GAGAAGAATA CTAAAAGCCA 
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1903 01 .AGACTGCTAG AGCGGTAAGA ATAAATAATA GAGGAATGCC TAAAATTATA 

190351 GCAGCGGTTC CAGAACTTAC AGCAATTCCT ATAATTAGAA TAAAAGCGAC

190401 AAGGTCCAGA CCTTTGAGCT TATGAGATGC GAGTTGCGTG GGCATGCTTT 

190451 CGCTGCTGAT CAGCATGGTT AGATTTACGT TAGTCAAATT GGGTTCTGTG 

190501 GTGGACATAA AACTTTTTAA ATAAAAAACA AAAAGTTTAA AAAAGATTCT 

190551 TTTTTATGTG AAGTTATTTA TATTTTAAAT AGAAGTTGTT TATTTAAAAT 

190601 AATAAATAGA CAACATTTTC ACTTTAAAAA GTATTGGATG CGGCTTAGAG 

190651 CCAAGAATCT AGGGGGTGCT TGTGATCAGG ATAAATTGGC TGTGTTTTTA 

190701 GAAAGCTGTT CGATTGGAAG CACGGACGCT TTCCGACATA GACTAGGGAG

190751 TCGTTAGATC GAAACGGGGT GGAATGATGG GAGGCTGGTC CTTGTCTGTA 

190801 AGGATGATAA TTTTTTGCTC TTCCTGGTTG TCTTGTTCCC ATCCAAAATC 

190851 TTGAAGAGAT GTGATGAGTG CCAATTTTAT AGCCTCGATC CATTCTGCTC 

190901 TTGTTTTCCC GAGGTTTAGA AGTCTTGATG GAGCAAATAG ATTACATCCA 

190951 ATGAGGGGGA GTTGAATCAC ATCAACGCCT ATGATTTCAG CTTCTTGGAA 

191001 CAGGTTATGA AACGCTTTCT TGCTTACTTC AAATGCTTGC TTAGGATCGT 

191051 TGTTACACTT AGCAGCTTCG GGACCAAGCA GTTGTGCTAA GAAGTGTGCT 

191101 TTGCCTGGGA CATGGTCGTT TGAGGTGTGA TCACTATTTT CCCATTTTGC 

191151 TGAGCGGCAT TCTCCTGGCT GTAGTTGGGA TCCAGAACGA GAGTGCGCTC 

191201 TAGGGAGCCT AGATGCGTTC CAACACTGTA GACTTGTAGC CAGGGATAGA 

191251 GCTTTATTCG TTCCCCCTCC TTCTCGGGAG ATGTTCTCGT TTGCTGCGTT 

191301 AACAATCATC ACCCTGCCTT GAGTTTTGAT TCTTGGAACT GCAATATCTC 

191351 CTGAGGTAGA TATAAAGATA AGCTTCGAGT CTTTAAGCTT CCAAATAAAA

191401 CAGGGCTGCT GAGGTGTTTC TGTAGTAAAC GATGGGTCTA CAGCGGCTAA
191451 GCTCGGAAGA AGATCCCAGT TTTGTTCAAG AAGAGCCTCA TAGCCAGGAG
191501 TGACCGTGAT AGCGGTAAGT TCTTCTGGGG GTGTGGTTAG ATCAGCAAGA 
191551 GAGACTTCGG GGAGTGATTC GGGATCGGGA AGTGGATCGA TCTGCAAAGG 
191601 GAGGTCCTGT TGCTCAGGAG GCTGGGGAGA GGGAGACAGA GAACTTTGCT 
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191651 


CAGATTCGGG 


TTCGATTTGA 


GGAAGAAT C 1 


CGI Ai AAl 1 i 


AHHTTTCTTT 


191701 


AAAAGGGAGT 


AAAGTGAGAC 


r^r^/^Ar^f^A a*T"T* 


rr'TA AT(^TT2X 


THAPAACTAC 


191751 


GGCAGAAAAA 


ATAGGCATACj 






PTAAAGCCCA 


191801 


AAGCGCACGC 


TAGGGCTAGl 


(ji i LjAvjoo i vjA 


V— 1 i i. 1 V-Mo 


TATTTTCTCG 


191851 


ATTCTATCTG 


TACGGTTGAG 


ACjCjLjAVj i U 1 a 


ATPPriATAHfi 


AATC^TTTCGC 


191901 


AGGAGTTCTG 


TAGAGACTGG 


Uvj» 1 i i A 1 A 




nCATTAGAAT 


191951 


CTGTCATAAT 


ATTTTATTAA 


rp A O A T*/^ TV A TXT* 
1 AvjjA i L. AA 1 I 


TTPA APT ATT 


ATATTACTAA 


192001 


TTTGTATTTT 


TATTAGATTT 


1X1 1 Al AAAA 


PA A A ATTAfiT 


TTATTAATGC 


192051 


ATCCTATTGA 


AATAGATCTT 


TiTtT A OT"T* AAA 
111 A(j i 1 AAA 


Apnr: A PP AT A 


AHTAGCTGTT 


192101 


TGTGGTCTGT 


AAGGATTATA 


GTC A 1 (jtjbALj 


fwr* APP/^PTf^ 


PTnrnCAGCA 


192151 


AAGGAGCGAA 


GAGCTTCCAT 


AAGACC 1 1 1\- 


TTT' ATA TP A T 


TTAPPCAPTA 


192201 


CGTACGGATG 


TGGTAAAGCT 


TATATGCACT 


PPT* ATT APPP 
GGlAl lAooU 


TTTPTTTHHT 
X X X v3 X X X 00 X 


192251 


TTACGGGTTC 


TAGTTCCAGC 


rn(T>rn/^^<T>00 A O 


P'PP APT AT AT 
\j X G Ao i A i A i 


AHAHf^AAGAG 


192301 


ATCAGAGGCA 


CTTGGACCAC 


AGTGvjL 1 i 


TT A TTP A P A 
I 1 A 1 i oAVj A^j 


PTTPATCAAA 


192351 


ACAGTTCAAA 


TAGGCTTTCT 


m A A n*» A A O A *T>rp 

TAATAAuAi 1 


Ljx- J. i AA 1 i. 


TPAPPATGTG 


192401 


CTTTCAATTC 


TCCTTCATAT 


rp<T>AO/^A^^A A 

TT AGGACLAA 


P A A PTTPTPP 
GAAo 1 J. 0 X 0^ 


T A An AAATGT 


192451 


GCTTCTCCTG 


GGTTCGTATC 


ATI 1 Ai H-G 1 


PP APTPTPT A 


TTGATCCAGG 


192501 


GTGCTGAGCG 


GCATTCACCC 


A/^A/^AfTA A T*^ 

AL ACjA 1 AA 1 U 


PTTTPPPAf^T 


GTTTATTTTT 


192551 


CCCCCAGATG 


TTCTCGTATT 


OfPT^O/^ A A 0 A A 

\j \ i CL. AAC AA 


pT ApnnTnTn 


TGGCTGCTGA 


192601 


GAGAGCAGCA 


TTGGTTCCGG 


CTCCACuAGA 


T"T'PP ATPTTP 


nAATTPGCTG 


192651 


CGTTAACAAT 


CATGACTCTT 


CCTGAGGTTT 


TCAGGCGTGG 


TTTAPCGATG 


192701 


TCTCCTGTAG 


TGGCTACTAA 


GGTAATTGGG 


GCTCCTTGAT 


riTTCCCAGAC 


192751 ATAGTATCTT TGATTAGGAT CTTGGAGAGT CCAGGATATA TTGATTTCTG
192801 AGAGAGTATT GACTAGGGTC CATTCATTTC TCAGCAGCTT TTGGTAATCC
192851 AGAGGGACTG AGGGTTGCAC TTGAGCTGCA ATATCCTTGG GGACATGTTT
192901 TTGGGGCCCA AGGAGCTCTT CTGTCGGTTT GGATGGCTCT CTTCTTCTTA
192951 AAAGAAGACA AGAGAGTACG ACTGCTGCAA GGAGAGCAGC ACCAGTGGCT
193001 ATAGCCATGA GAGGCATACC AAGAACTCCA GCAACAACTG CAGTCCCTAC
193051 AGCTATGGCT AAAGCAAGAA TAAGAGCTGT GAGTTTTATT ATTTTAGCAA
193101 TCACTTCTTT CTTAGTAAGC ACGGGTGATT CTGAGATCAA TCTCTCATCT
193151 ACAGACAAAT GTGATTGATT CTCGTGTACA TCTGGGATTG AAGTGGTTTT
193201 AGTCATAAAA TATCTATTTA AAGGAAAAGA TTATATCTTA ATTTGTTGTT
193251 TTTTACAAAA AATCTTTTTC TTTGATAATA AAGATTTTTA TAAATTCTTT
193301 TTAATTTCAT TAATTTTAAT GAGTTAAGGC GTAGCAGGCG CACCTTATTG
193351 TTTAGTTCAT AAGAGAACTC TAAGTGGGTG CCACACAGAG TTTTTTCTCC
193401 ACTTGGTTCA AAAGGGTGGC GTTTGTGGTT TTATGGGGAA CGATTCTACC
193451 CCAGCCATGG GATTCAAAGA CTTCTTAGAT CGACAATACT ACTTGAGATT
193501 CTAATGGGGG GTGAAGACAT GTGTATTTCA TGGGCATGGG TAACGGAGGA
193551 CATCAGGTTT ACTTTCGAAG GCCCATATAT CATCTTCAGA GGTCAATAAA
193601 AGGGGGGGGT AGGGAGAACG ATGTTTTACA ACAGTGTAGA AATTAACTAT
193651 TTGTGAAATA TTTTCATTTA ATAACAATTT AATGAGTTTT AATTTGAAGA
193701 CACAGGCAGT GGCTTATCTT TTGTCTATAA GTCGGTACAG TCAGACTTAT
193751 AAGAATTGCC GCTTTTTGGG TCTTCATAAT CTCTATCTTA AACAAAAAGG
193801 AGGTCAATCA CAAACTGGGT GAGTTGTAGA GGTCTCAATA AGAAAGTTTG
193851 TAAGGTTAGG AGAGCTATAA TTGCGCCTGG GATTCTGCAT AGACGATACC
193901 TGAAAAGATA GTGCCTAGGG TCGGTGTTTG CTGATGAGAA TGCCAAGAAG
193951 CATAGTCATC AACCGTCTTA ACATATTTAT AAGTTTTCTT AGCAGCGAAT
194001 GGCACAGCTG CTAGAAAGCA AAACAAGATG ATAACAGCCA CCAATAGAAT
194051 TGTTCCTGAG ACAGCAAGGA CTGCAATGGC AGCTGGATGT CCAGTTTCTA
194101 GAGTGAATTG CAGAGCAGGG CGGATTGCGA ATAAACTAAG AACTATGAGG
194151 CAGCATCCGA TCACTAGGAG AACTTCCAAT ATGACAGAAG TGATTGTCCG
194201 TGGAGAAGAC TCAGCTTGCT GCTTCCCCTT GAGTTCTATG AGTTTCTGAT
194251 TTTGTGTGTT TGCGCTATTG ATTAAGGGAG CGGCTGGACT TTCAAAAAGA
194301 ACGTCCGTAG CAGATACTGG AAGATATCCC ATTTATTTTA ATATCCTTTA
194351 ATTAAAATTT GCGCAGATAT TATACCTTAA TTTAATTTTT ATTAGAGAAT
194401 AAAATTCTAT ATTTTCTTTT GTGTAATATT TTTTTATAAA AACATGAGTT
194451 TTTAATTTAA CTTAGAAGAG AAGTGTTGCA ATATCTTTTA TTAAAACAAG
194501 AGTACCTAAG AATTCGTGCG TAAAATATTA CGTATTTGAT TTGTATAAGT
194551 CGTGAGAGTG TTTTTTCTAT TATTGAGGGG CTATTTTGAT TTTTCTTGAA
194601 TTGCGTCATG GAAAGATCTT GGAGGCTTTA GGAGCCTTTT TGTCCCTCAT
194651 CATCTCGAGA TAACCAAACT GAAAAGATGC GTACCAAGAG TTTGTTTTTT
194701 CGTTCGCGAA TCTACTGTAC CCTTGATTTT AGGATAGCCG TTTTGAGTGA
194751 GATAGAGATC CTAAGAGAGA TCCTCAATAC TTTCTTGAGG TAGTTGGGCT
194801 AAGGATCTTT CAACACAAGC AATAGTTAGC GGATATTCTG GGTGCAGCTC
194851 GGCCATGTAT TTTACTGCTG TAATCATAGC CAAAGATTGT TGCATTCTTG
194901 ATTGCGGGGG CAATTCTCTA GAATAGAGAT GCTCCCTCAA GCTGTATAAA
194951 GGGATTTGGA TAATTTGGGT GTTCGTATTG ATCGCAGTCT GGATATAAGT
195001 GACATAGAGT GCATAGTAGT ATTGGTAGAT GGCTTCAGGA GTTGTTGGTA
195051 TTTGTGGGAG GAAGACGCGG CCTCGGAAAT CGGGTTGCCA GTATAAACGT
195101 TCTCCTGAAT CTCCCGCAGG GAGTTCTGTA TTTATTTCTA CCCTGGTTTC
195151 CTGAATTGCC TCATTGAGCA AGGATTTTGG GGAGGGAGTG GTATCTACGT
195201 AGGAGGATAT AGGCGCATTT TGCGAGGTTT CTTCAATTAA CAGAGTCTTT
195251 CCTAAATTTT CGGAAGCTGA AATTAAGAGA TGTTCTGGAT TGCCTCCCGG
195301 TAGGAGGGAT ACTAGGGAAA ACAGTGAATG TTTAGCTTTC CATATCTCGT
195351 AAGGGTTATC TGTGAGTAAT GCAGAGGGTG CGGGGAGATT CTGTAGGTCC
195401 TTGGGTTGGT GTTCCCAGTT TTCTTTAGAA CGATTGATTG CGATAGAGAT
195451 AGAAGAATAC TGTAGGTTGG TGGAGACCCT TTTTGGTTTT ATTTCTTTGT
195501 GTTTGCCACA AGCACGCGGT TTTAAGAGGG TGGTTGCTCG TTTATAAAGG
195551 ATAAAGTTGC TAAGAACAAC AGCAATAAAT GAMTTCCTG TGAGGAAATA
195601 GATAACAGGA ATATTTAAAA GCAAACCAAT AAGTAGGGTC CCTATAGTAA
195651 CTAAGACAGC AAAGATTGCA GTAATCAATA GAATTCGAGA CTGAATCTTA
195701 TCAAAGTTGC TGCAGGATTT TTTTTTAGTG TCTACTAGGG GATGATCTAC

195751 AGTTAAAGAT ATGGGTGAGA TTGTAGCCAT AATTAAGGAA TTCACCTGAG

195801 TTCCTTTAAA TTATATCTCT TTTATTGATA TTTAAATAAT TAAAAACATT 

195851 TCTGCATTTA AAAATATTTT TTAATTTGTC ATCTTTCTAA GTAAATATAA 

195901 AAAATAATAA GTTTATTTAG TTAAATTTTT TTTGAAAACA AGAAAGATAG 

195951 GGGAAAGGCT CGCGAGTAAA TCAAGCCTTT CCTGAATGCA AAGAGTCGCA

196001 TAACTACTTA GAAAAGAGTT CGTCAGAATA GAAGTTGTAT TAATATAAAA 

196051 TTTTTATTTT CTAAGATTAG AAAGTAACTT TGACACAGCC ACCTTTGATG 

196101 CGGCACTGAC AAGCAAGACG TTCGTTAGAG TCTTCGGGTT CTCCTAGAAA 

196151 ATCGTATTCT GGTTCCGTAA ACTCAGAAAG ATTCTCACGT CCTTCTAAGA 

196201 CCTCTATCAC ACAAGTTCCA CAGACACCTT CTGTACAAGC AAAGGGAATG 

196251 CCCATGGATT CACAAGGCTC TGCGATCTCA CTATTGTCTT CTAACTCGAA 

196301 CTCTTGTTGT TCATCATCAG AGGTAATGAC TAGCTTGGCC ATGGAGTTTT

196351 CCTTCTACGT ATTGTCGGAT AGATTAAAAT GAAGTTCGGA GTAGAGGGAT 

196401 TCGAACCCCC GACCTATTGC TCCCAAAGCA ACCGCGCTAA CCAAGCTGCG 

196451 CTATACTCCG TAAAATAAAA TTTTCGAGTA AAAGAGATCA ACGTTAGCAT 

196501 AGAAGGAAAT AGTCGACAAG ATCAAAGATG CTTCATAACG TCTCTTTAGA 

196551 GTTTTACTTG CAGATATATG GAAGAGGGAA GTATGAGAAA AGGCAGGGAT 

196601 TCTCTAGGAG GCTGTTTTTG TGTCTGGGAA GAAAGATGGT GTAAGGGGAA 

196651 TGATCTTTGT CCCTCTTAGC ATCCTAGTAC TAATCTTTTT ACCTCTTCCT 

196701 CAGATCCTTC TTGATTTTGG ATTGTGTATT AGTTTTGCAT TGTCTTTACT 

196751 AACGGTCTGT TGGGTCTTTA CCTTAAATTC AAGCAATTCA GCGAAGCTTT 

196801 TTCCTCCATT TTTCTTATAT CTTTGCCTAT TGCGGTTGGG ATTGAATCTT 

196851 GCATCAACAC GATGGATTGT CTCTTCAGGA ACCGCCTCTT CTCTGATTGT 

196901 TTCTTTAGGC AGTTTCTTCT CTTTAGGAAG TCTATGGGCA GCAACGTTTG 

196951 CGTGCCTCCT TCTTTTCTTT GTGAACTTTT TGATGGTTTC AAAGGGTTCG 

197001 GAAAGAATCG CAGAGGTCCG TTCGCGGTTT TTCTTAGAGG CTCTTCCAGC

197051 AAAACAGATG GCTTTAGATT CTGATCTTGT TTCTGGAAGA GCTTCTTATA
197101 AGGCTGTCAA AAAACAAAAA AATGCCCTTA TAGAAGAAGG GGATTTCTTC
197151 TCTGCCATGG AGGGGGTCTT TCGTTTTGTT AAAGGGGATG CAATTATTAG
197201 TTGTATCCTT TTACTCGTGA ACGTAGTTTC TGTAACTTGT CTTTATTATA
197251 CTTCGGGTTA TGCTCTTGAG CAGATGTGGT TTACAGTTTT AGGAGATGCT
197301 TTAGTGAGTC AAGTACCTGC TTTACTTACT TCGTGTGCTG CAGCCACTCT
197351 TATTAGTAAA ATCGATAAGG AAGAGAGCCT TTTAAATTAC CTGTTCGAAT
197401 ACTACAAACA GTTGCGTCAG CATTTCAGGG TGGTGTCGTT ATTGATCTTT
197451 TCTTTGTGCT GCATTCCCAG TTCTCCAAAA TTCCCTATCG TTTTGCTCGC
197501 GAGTCTTTTA TGGTTGGCGT ATCGAAAAGA AGAGCCTGCA TCAGAAGATT
197551 CTTGTATAGA ACGTGCGTTC TCTTATGTTG AGGGGGCCTG CCCTAAGGAA
197601 CAAGAATCAC AGTTCTATCA AGTATATCGT GCAGCATCCG AAGAAGTATT
197651 TGAAGATTTA GGAGTTAGAT TGCCTGTGCT TACTTCTCTA CGTATTGAAG
197701 AGCGTCCTTG GCTCCGAGTA TTTGGCCAGA ATGTATACTT AGATGAAATG
197751 ACTCCAGAGG CTGTGCTTCC TTTCCTTAGA AACATCGCTC ATGAGGCTCT
197801 CAATGCCGAG GTAGTTCAAA AGTACCTTGA GGAATCAGAG AGAGTGTTTG
197851 GCATCGCTGT TGAAGACATC GTTCCTAAGA AAATCTCTTT AAGCTCTCTT
197901 GTAGTTCTTT CTCGCCTCCT TGTTAGAGAA AGGGTATCGC TTAAGCTTTT
197951 CCCAAAGATT CTAGAGGCCG TTGCGGTATA CCAAAATTCT GGAGACAGCT
198001 TGGAGATCCT TGCGGAAAAA GTGCGAAAGT CTCTCGGATA TTGGATTGGG
198051 AGAAGTCTCT GGGATCAGAA ACAAACCCTT GAGGTAATTA CCATAGATTT
198101 TCATGTTGAA GAATTGATAA ACAGCTCATA CTCAAAGTCT AATCCTGTAA
198151 TGCAAGAGAA TGTGATCCGT CGAGTAGACA GTCTTTTAGA ACGGTCGGTA
198201 TTTAAAGATT TTCGAGCCAT AGTTACGAGC TGTGAAACAC GATTTGAGAT
198251 GAAAAAAATG CTCGACCCAC ATTTCCCTGA TCTTTTGGTT TTATCTCATG
198301 ATGAGCTTCC TAAAGAAATC CCTATTTCCT TCTTAGGGAT CGTTTCAGAT
198351 GAGGTTTTAG TTCCTTAATT TAATTTAATC TTCTGTAAGC ACTATTACTT
198401 TCGTATTTTT TTTGTTGGTG TAAATATTGT TTAAAAATTT TTTTTGAATT
198451 AATCTAATAA ATAACTAGAT AAAAAAAAAT TTGTGAAAAC ACAGCAAACT
198501 CAAAACATCA TAGAGGTTTG GAACTTCTAC TGGGAGACTC AGGAAATAGA
198551 GTATCGCGAT AGCTTAATTG AGTTCTATTT GCCTTTAGTA AAAAGTGTGG
198601 TTCATCGTTT GATTTCAGGG ATGCCTTCCC ATGTAAAGAC CGAGGATTTG
198651 TATGCTTCGG GTGTTGAAGG TCTCGTCCGT GCGGTGGAAC GTTATAATCC
198701 TGAGAGAAGT CGTCGTTTTG AAGGTTATGC GGTATTTCTG ATTAAGGCTG
198751 CCATTATTGA TGATCTGCGT AAGCAAGACT GGGTTCCTCG TAGTGTCCAT
198801 CAAAAAGCGA ATAAATTGTC AGGAGCTATG GATTCTCTTC GCCAGTCTTT
198851 AGGCAAGGAA CCCACGGATC TTGAACTGTG TGAGTATCTC AATATTTCGC
198901 AACAAGAGCT TTCGGGATGG TTTGTATCTG CCCGTCCTGC ATTAATCGTG
198951 TCTCTGAATG AAGAGTGGCC TTCACAAAGT GATGAAGGAG CCGGAATGGC
199001 TCTTGAAGAG AGAATCCCCG ATGAACGTGC CGAGACAGGG TACGATGTTG
199051 TAGATAAACA AGAATTTTCT TTATGTTTAG CCAATGCGAT TCAGGAACTT
199101 GAGGAAAAGG AACGCAAGGT CATGGCCCTG TACTACTATG AAGAACTTGT
199151 CCTTAAGGAA ATCGGTAAGG TCCTTGGGGT AAGTGAGTCT CGCGTCTCTC
199201 AAATTCACTC TAAAGCATTG CTTAAGCTTC GCGCAGCACT CTCTGCATTT
199251 CGATAAATAC AGTTCTCAGG TTTTAAGAGC AGTCCTAGAG CTAGGAGAAG
199301 CCCTCCTACG ACACAGAGTA ATCCGTAAAG AATTTGTTTA GTCAACGCCA
199351 TAACAATGCA GCCAGCAATT GTGAAGATTG CTCCTACAAT CAAGAGAAGA
199401 ATGCGGGGCA GCCATTCTTT AATACGTTCG GCTAGGGTAG AGGTAGGAAG
199451 GGGTGCCTTG CTAGTTGTGG CATTTAGACT AGTTGCGGCG TTTAGCATCG
199501 ACGGAACCCC AATTGGAGAT GTAGACATAG CGACCTAATC TTTAGAGAAG
199551 ATAGAAGTTA TGGTATCTCA GCAAAGGAAT TATTCTCAGC GTACAATCTT
199601 TTCTTTATTG TTAGATAGGT GGTGCCTTGC ATATAAGCTT TAGGATCGAT
199651 AAGATAGAGC TTGGCAATTC TTAATTATGT ACGATCACTC ATGCAATCCT
199701 GGTTACAATC TTTACAAGAG CGAAATATTT TAGAGAATTT TACCGCAGGT
199751 TTGGAATCCG TAGAGGGACC TATCGCCGCT TATTTAGGAT TTGATCCTAC
199801 CGCACCTGCT CTACATATTG GTCATTGGAT TGGGATTTGT TTCTTGAAGA
199851 GACTCGCTGC TCTGGGGATT ACCCCCATAG CTTTAGTCGG GGGAGCCACA
199901 GGTATGGTTG GAGATCCCTC AGGGAAACAG AGCGAGAGAT CGTTACTTCA
199951 GACAAGTGAA GTTTTTGATA ACAGTCAAAA GATCACGGCG TGTCTCCAGC
200001 GCTATCTTCC CGGGGTGACT CTTGTAAATA ATGCAGACTG GTTGCAGGAG
200051 ATCTCCCTGA TTGATTTCTT AGGGGATATA GGAAAACACT TTCGTTTAGG
200101 CCAAATGCTA GTGAAAGATA CAATAAAGCA GCGGGTGCAT TCTGATGAAG
200151 GAATTAGCTA TACCGAGTTT AGCTATTTAA TCCTGCAATC CTATGATTTT
200201 TATCACTTAT TTAAAAATTA TGGCACGATC TTGCAGTGCG GTGGTAGCGA
200251 TCAGTGGGGG AATATTACTT CAGGAATCGA TTTTATTCGC CGTAAAGGGT
200301 TGGGTCAGGC CTACGGCCTT ACCTATCCTT TATTAACGAA TGCTCAGGGG
200351 AAAAAAATAG GGAAAACAGA GTCGGGAACT GTATGGCTCG ATTCAGATTT
200401 AACCTCTCCT TTTGAGCTGT ACCAATACTT ACTCCGTTTG CCCGATGATA
200451 CCATCCCTAA AATTGCTCGT ACGTTAACTT TATTGAGCAA TGAAGAAATT
200501 CAAGATATTG ATAGGCGTGT ACAGACGGAT CCAGTTGCAG TGAAGGAATT
200551 TGTAGCCCAA GATATCTTAA GTGCTATTCA TGGAGATCTA GGGCTTGAAG
200601 AGGCTCTTTC TGTAACTCGT AGCATGCATC CAGGGAATCT TTCATCCTTA
200651 TCGGAAAAAG ATTTTCATGA ATTGTTTGCA GGAGGGATGG GGGCCTCATT
200701 GGATAAATCC GAGGTGTTAG GGAAACGTTG GTTAGACCTA TTTCTTGTTT
200751 TGGGACTATG TAAATCTAAA GGGGAAATTC GAAGGCTAAT TGAACAAAAA
200801 GGGGTATATA TTAATAATGT GCCCATCGCT AATGAGCATA GTGTTTGTGA
200851 AGAACAAGAC ATCTGTTATG GTCACTATGT GTTGTTGGCT CAAGGTAAAA
200901 AACGAAAGCT TGTTCTATAT TTAAATTAGT TTAAGGAGGC TAGGTAGCTT
200951 TGCAAACGAA TATTGGTCTT ATTGGCTTAG CTGTCATGGG GAAAAATCTT
201001 GTCTTAAACA TGATAGATCA TGGTTTfTCT GTCTCTGTCT ATAATCGGAC
201051 CCCAGAGAAA ACGCGGGACT TCTTGAAAGA ATACCCTAAC CACCGAGAGC
201101 TTGTAGGGTT TGAATCTTTA GAAGATTTTG TGAATTCATT GGAGAGACCA
201151 CGAAAGATCA TGTTGATGAT TCAAGCAGGG AAACCTGTGG ATCAGAGCAT
201201 TCATGCGTTA CTGCCTTTTC TAGAACCCGG CGATGTGATT ATCGATGGGG
201251 GGAATAGCTA TTTTAAAGAT TCCGAACGAC GATGTAAAGA GTTGCAAGAA
201301 AAGGGGATTC TCTTCTTAGG CGTGGGGATT TCTGGAGGAG AAGAAGGTGC
201351 ACGTCACGGC CCATCAATTA TGCCTGGAGG AAATCCTGAG GCGTGGCCAT
201401 TAGTGGCTCC TATTTTTCAA TCAATAGCAG CAAAAGTACA GGGCCGTCCC
201451 TGCTGTTCTT GGGTAGGAAC TGGCGGTGCA GGCCACTATG TAAAGGCTGT
201501 TCACAATGGT ATAGAATACG GCGATATCCA GTTGATATGC GAAGCTTACG
201551 GTATCTTAAG AGATTTCCTA AAGCTCTCCG CAACTGCCGT TGCTACAATT
201601 TTGAAAGAGT GGAATACTCT AGAGTTGGAA AGCTATCTAA TTCGTATTGC
201651 TTCTGAAGTC CTAGCATTGA AAGATCCGGA AGGAATCCCT GTTATTGATA
201701 CGATTTTAGA TGTCGTGGGC CAAAAAGGTA CAGGAAAGTG GACCGCAATC
201751 GATGCTTTAA ATTCTGGAGT TCCCCTTTCC TTAATCATAG GAGCTGTTCT
201801 TGCTCGTTTC CTTTCTTCTT GGAAAGAGAT ACGCGAGCAA GCTGCCCGTA
201851 ATTATCCAGG AACCCCCTTA ATATTTGAAA TGCCCCATGA TCCCTCGGTA
201901 TTCATACAAG ATGTCTTTCA TGCTTTATAC GCTTCCAAGA TCATCAGCTA
201951 TGCTCAGGGA TTCATGCTTT TAGGAGAAGC TTCAAAAGAA TATAATTGGG
202001 GATTAGACCT AGGAGAAATT GCTTTGATGT GGCGCGGGGG ATGCATTATT
202051 CAAAGTGCAT TTTTAGATGT TATACATAAA GGATTTGCTG CCAACCCAGA
202101 GAATACCTCG CTCATCTTCC AAGAATATTT CCGTGGAGCA TTACGCCATG
202151 CGGAGATGGG ATGGCGTAGA ACAGTAGTGA CTGCAATTGG TGCAGGGCTA
202201 CCTATTCCCT GTTTAGCAGC AGCAATCAGG TTTTATGATG GCTATCGTAC
202251 AGCAAGCTCT TCAATGTCGT TAGCTCAAGG ACTGCGAGAT TATTTTGGAG
202301 CTCATACCTA CGAGCGTAAC GATCGCCCTC GAGGAGAGTT CTATCATACC
202351 GATTGGGTGC ACACGAAAAC TACAGAAAGA GTGAAGTAAA AATAAAAAAT
202401 CTCGAGAAGA CCCTAGGTAG CTCGAGATTT AGTTCACCAC TACCGAATTT
202451 TCAATTGTAA GCAGTGTGCT GATTTTAAGC GTCAATGTTA ATCTAATTTT

202501 AGAACTTCAA TGAAAGCTGT ATTGGGAATG GAAACTTTTC CAAATTCCTT 

202551 CATACGTTTT TTTCCTTTCT TTTGCTTTTC CCACAGCTTG CGTTTCCTAG 

202601 TAATATCTCC GCCATAACAC TTTGCGGTCA CGTTCTTAGA AAGCGCACGA 

202651 ATCGTTTCTC TGGCAATGAC TTTTTTGTTA ATGGCAGCTT GGATGGGAAT 

202701 CTTGAAGAGT TGTTGTGGAA TCACGTCCAC AAGCTTTTCG CAGATACTTC

202751 TTCCACGAGA TTCTGCTTTA TCTCTATGGA CTAAACAAGA AAAAGCATCT

202801 ATGGGCTCCT CGTTAATAAG AACCTCTAAT TTGATGATCG ATCCCTTACG 

202851 GTAATCCCCA AGACGGTAGT CAAAGGATCC ATAACCTTTA GTTACTGACT

202901 TCAGCTTGTC ATTGAAATCC GAGACAATCT CATTTAAAGG GAGTTCGTAA 

202951 GCAAGAACTA GACGGTGCTG ATCTAGCATT TCTGTTTTTA CGCAGATCCC

203001 ACGTTTATCT AAACAGAGGT TCATAATGTT GCTCAGATAT TCTTGAGGGG

203051 TGATAATATT CACATGAACC CAAGGCTCTT CCACATGCTC GATGATCGCA

203101 GGATCCGGAT ATCCTGAGGG GTTATCAATA TCTAGAACTT TCCCGTTTTT 

203151 TAAGACGACT TTATAGATGA CACTTGGAGC CGTTGCAATA ATATCTAAGT

203201 CAAATTCTCG AATGATTCTT TCAAAGATAA TCTCAAGATG AAGAAGTCCT 

203251 AAGAAGCCAC AACGAAAACC AAAGCCTAAA GAGTGACTGC TTTCTTGTTC 

203301 TATAGTTAAA GCAGAATCAT TGAGCTGTAG TCTTCCTAAA GCATCTTTCA 

203351 AAGTATCAAA ATCAGAAGAA TCTATAGGAT AAATTCCAGC AAAAACTACC

203401 GGATTGATCT CTTTGAAGCC TTCCAAAGGA GTTTTTGCAG GATGTTTTGT
203451 TTTCGTGACT GTATCGCCGA TCTTCACATC CTTCACTTTT TTGAGATTGG
203501 CAATAAAAAA ACCCACCTGA CCAGGGCGTA AGGAACCTTC TATAAATGTT
203551 GCTTTAGGGA GAAAGGCCCC TATACCTAAG ACTTCAAACG AGGAGCCTTT
203601 AGCCGCCATA AAAGTAATGC GGTCTCCTTT TTTTAATTCC CCGCTAATAA
203651 TGCGTACGTA GACCATAATG CCAACGTAAG GGTCATAATG AGAATCAAAG
203701 ACTAAAGCTT TAAGCTCTGT TTCTGCAGGT GCTTTTGGAG GAGGAACAAG
203751 ATCGATAATT GCTTTCAGGA TTGCAGGGAT CCCCTGACCT GTTTTTGCAG
203801 AACAGGCAAT AATGTTCGTA GTGTCTAGGC CTATATAATC TTCAATCTGT

203851 TGAGCAATTC TCACGGGATC AGCGGCAGGT AGATCAATCT TGTTTAATAC
203901 AGGAATGATC TCTAAATCTC TTTCAAGGGC CAGGTAGACA TTAGCAAGAC 
203951 TTTGTGCCTG CACCCCCTGG GCGGCATCTA CAATAAGTAA GGCGCCCTCA 
204001 CATGCAGATA GAGATCGAGA GACTTCATAC GAAAAGTCCA CGTGACCAGG 
204051 GGTATCAATC AGGTTCAGTT GATACACCTC TCCTTCATAT AGATACGTCA 
204101 TGGTGACAGG ATGAGCTTTA ATTGTAATGC CACGCTCTCT TTCAAGATCC 
204151 ATGGAATCTA AGAGCTGCTC ACGCATCTCC CGTTCTTCTA CTGTGCTCGT 
204201 ACTTTCTAAA AGGCGATCAG CAATTGTAGA CTTCCCGTGA TCAATATGCG 
204251 CTATGATTGA AAAATTGCGA ATGTTCTCTA TCTTATATTC TTTCAAAATG 
204301 TACTGTAGTG TTATCTAGGT TTATTCCTGG TTTCATAAAG CTGCATTGAG 

204351 AAGGCTCCTT CAGACAAAGA CCAGCTTCTC AAATACTTTT GCGGACTGCA

204401 TCTCTTTATT AGCAAAATGA TCATGCGGTC ATCTTTAGCT TTGGATAGAT
204451 GAAAGTATAA TAGATGAGAT GTAGAAACCA CAAGGGTTTA AAGTCGAATC
204501 TGATTTGAGG TAGAGTTCCA GAGTGGCTAG CAGTAGAAAA AACTAAGTGA
204551 GAAGAAGTGC TCTCCGTTAG TATGAAACTG ATTCCCACTC AGGATTCTAT

204601 AGAAAGGGAA ACCGATTCTA AAAGAGATAA AAAAATATTT ACCATTTACA
204651 TATGTTCATC TAAAGTCCTT GCGGGTCATT TTTTCAGTCA TTTAGACAAG
204701 CATAATAAAA TTCATGAAAG CATTGGGGTT TGAGATAGTT AAATCGACTC
204751 GATCCAAGAG TATAAGAGAG GAATGAGTGT TCTTATGTCC GAGATAGTGC
204801 TCTTCAATGT AGTGAAGAAC TGGAAAGTCA GGAGATCTAT TTGAGTAGGA
204851 GTTCAGGTCT TCGCAATCGA TTTTTTCAAG CTCTGGAGTC TATGTGAAAG
204901 AAATTTATAG AAGTAAAATA ATTCCATGAA TTCTAAAATG ATTAGGAAAA
204951 TAATATAACA CGCTGATACT CAAGTCACAA TGGAGCTTCT ATGGTTAATA 
205001 TACAGCCTGT GTATAGGAAT ACCCAAGTCA ACTATAGTCA GGCTACCCAA 
205051 TTTTCGGTGT GCCAGCCAGC GCTTAGCCTG ATTATCGTTT CTGTTGTTGC 
205101 TGCTGTACTC GCTATTGTAG CTTTGGTATG CAGTCAATCT CTTTTATCCA 
205151 TAGAGTTAGG AACTGCTCTT GTTCTAGTTT CTCTTATTCT TTTTGCTTCT

205201 GCTATGTTTA TGATTTATAA GATGAGACAA GAACCTAAGG AGTTGCTGAT 

205251 CCCTAAGAAA ATCATGGAAC TCATCCAAGA ACATTATCCA AGTATTGTTG 

205301 TTGATTTTAT TAGAGATCAG GAGGTTTCCA TTTATGAGAT ACATCACTTG 

205351 ATCTCTATTC TTAATAAGAC GAATGTTTTC GACAAAGCAC CAGTATATTT
205401 ACAAGAAAAA CTCTTACAGT TTGGCATTGA GAAGTTCAAA GATGTACATC 

205451 CAAGTAAGCT CCCTAATTTT GAAGAAATTC TTCTACAGCA TTGCCCATTG
205501 CATTGGTTGG GACGTCTGGT ATATCCCATG GTATCGGATG TCACTCCAGG 

205551 AACCTATGGA TACTATTGGT GTGGTCCTTT AGGACTGTAC GAGAACGCTC
205601 CCTCTCTTTT TGAACGTCGA TCTCTTCTAT TGTTAAAGAA AATTAGCTTT 
205651 GGAGAGTTTG CTCTTTTAGA AGATGGTCTC AAGAAAAACA CGTGGAGTTC 
205701 TTCGGAACTC GTTCAAATCA GACAAAACCT TTTTACAAGA TATTATGCTG 
205751 ATAAAGAAGA GGTAGATGAA GCAGAGTTAA ACGCTGATTA CGAACAGTTT 
205801 GATTCCCTCC TTCACCTTAT TTTTTCTCAC AAGCTCTCTT GAAAGCAAGT 
205851 GCAATTATTT CAATATATGA ATGAGTCCGG ATGGGATTGG CTTTGTGATT 
205901 TTGATTCTCA AGGCGAGGGA TTCCAGTTAT CACGTCTGGT TGGGCTGTTA 
205951 CATTCGTCCT GGGCATTATA CGAAGCAAAA GAGCAATTTT ACCTTCCTGA 
206001 GGTTTCTCTA TTGACCTGGG AAGAACTGAT AGAAATGCAG TTATTAAGCA 
206051 AACCAACAAA ACACGGGGTT GCAAAAGATC TTTGTAATGT ATTTGAAAAA 
206101 CACTTTCAAA GGTTTAGACA GTACCTAGGT TCCTTAGATC TAAATCAAAG 
206151 GTTCGAAAAT ACCTTCTTGA ATTATCCTAA ATACCATTTA GATAGGGAGT 
206201 GAGAAAAAAA TCCTAGGTCA GCTTGGCAGA AATTTTTGAA ATCCTTAAGT 
206251 GTTCGATCTG CATTTTTTTC GGGGATTGTA AAAAGTTTGC CTAATCTGAA 
206301 GGGTAAGAGA GAGCTATTTT TCATGGGATT TTTTATTACA GGAAATTCTT
206351 GAATTAGAGA TTTTTATTTA CACATAATCT ATACTGCCTT CAATAGATCT 
206401 ATATCTAAGG AGTTGGCTAT GAGCATGACG ATCGTTCCAC ATGCTTTATT 
206451 TAAAAATCAT TGCGAGTGTC ATTCTACCTT TCCTTTGAGT TCAAGGACTA
206501 TTGTAAGAAT AGCCATTGCC AGCCTCTTTT GTATAGGTGC ATTAGCAGCT

206551 TTAGGCTGTT TGGCTCCTCC CGTTTCTTAT ATTGTTGGGA GTGTTTTAGC

206601 TTTTATTGCC TTTGTCATTC TTTCTTTAGT AATTTTAGCT TTGATTTTTG 

206651 GAGAGAAGAA GCTTCCACCA ACACCAAGAA TCATTCCTGA TAGATTTACT 

206701 CACGTGATAG ATGAAGCTTA TGGCCTTTCA ATCTCTGCAT TTGTAAGAGA

206751 ACAGCAGGTA ACATTAGCCG AGTTTAGACA ATTTTCTACT GCCCTGTTGT 

206801 GTAACATATC TCCTGAAGAG AAAATCAAAC AATTGCCTTC TGAATTGCGA 

206851 AGTAAAGTAG AGAGTTTTGG TATTAGCAGG CTCGCAGGTG ATTTAGAAAA 

206901 GAATAATTGG CCAATATTTG AAGATCTTTT AAGCCAAACC TGCCCGTTAT 

206951 ATTGGCTTCA GAAATTTATA TCAGCAGGAG ATCCACAAGT TTGTAGAGAC 

207001 CTAGGTGTCC CTAGAGAATG TTATGGGTAC TATTGGCTAG GGCCTTTGGG 

207051 ATACAGTACA GCTAAGGCTA CAATTTTTTG TAAAGAGACG CATCATATTC 

207101 TTCAACAATT AACGAAAGAG GACGTTCTTT TATTAAAAAA CAAGGCTCTT 

207151 CAAGAGAAAT GGGATACTGA TGAAGTCAAA GCAATTGTAG AGCGTATCTA 

207201 CACTACCTAT ACGGCACGAG GAACTCTAAA GACCGAAGCA GGGGGACTTA

207251 CAAAAGAGAC AATCAGTAAG GAATTGCTAT TGTTGAGCTT GCATGGCTAT 

207301 TCTTTTGATC AGCTACAGCT GATCACTCAA CTTCCTAGAG ATGCTTGGGA

207351 TTGGCTGTGT TTTGTAGATA ACAGTACCGC ATACAACCTT CAGCTTTGTG

207401 CTCTTGTAGG AGCTTTGTCA TCCCAAAATC TTCTTGACGA ATCTTCTATC
207451 GATTTTGATG TAAACCTAGG CCTGTATGTG ATTCAGGATC TAAAAGAAGC
207501 TGTTCAAGCA TTTTCTGCTT CTGATGAGCC AAAGAAAGAA CTAGGTAAAT
207551 TCTTGTTAAG GCATTTGAGT TCAGTTTCTA AGCGATTAGA GAGTGTATTA
207601 AGACAGGGTC TTCACAGAAT AGCTCTAGAG CATGGAAATG CCAGAGCTAG
207651 GGTTTATGAC GTCAATTTTG TAACAGGAGC TAGAATTCAT AGGAAGACGA
207701 GTATCTTCTT TAAAGACTAA ACCAGGTAAC TAGCTTTTTA GTCTCGAAAG
207751 GGGCTACGAG AGGGAGTAGA ATTTTTCTTA TTCCAAAATA GAAATCTCTA
207801 ACCTATCAGC AGGGAGGACT CGTCTTCTTC AGGTTGGGAG AAGACGCTTC
207851 TCTTGAGTGG ATGGACTGTT ATAGGTGGTA AGCTTCTAAA GGTGGAATTA

207901 TGAGTTGCTA AATTCTCAAG GAACATCGCT ATATTGGTTG ACTGACGACG
207951 GATCGTTTTA AACTGATCTA AACTAAGGAG TGTCAATGAA GAAAGAAATG
208001 CTTCGCTTTC TTTATGAATT AGGCCTTGTG TGTAGAGAGT ACCAATTAGA 
208051 GAGGCAAATT CCATGTCTTG GGGTCGTCCT GTATGGTCTA AAGTCAAGGA 
208101 ACAGAGGTTT TCCCATAAAT CAGCCGGGAC TGTTTTTATT AAGGACATCT 
208151 GCTTCCACGA GTACCTATGC TTAAAAATAC ACAAGAGCAG GTTATAGAAT 
208201 TCATCCTCTT GAAGAGAGGG TTGAGATAGA TGTTTATAGG ATTGATACGC 
208251 CGCATGATAT TTAGCAAAGA GCTGCTGTTT CATAGTATTT AGATCACTAT 
208301 GATTCAGGTT CCACTCGCCA TTTAATGCTG CATACTTAAG ATGTTGAAAA 
208351 ACATCACGAG TTAAGACTCT AGCTAAAACT AATGTGTGTA GATTTAGAAT
208401 AGAGGAATTT GTTGGTCTAA AATCTAAAGG AATCAACCAA TAGGTAGGAA 
208451 GCATCTCTGG TTTTATATAA TCACTAAGGT CTTCTTCGAG GCTGCCTCCA 
208501 ATTTTCTTAA CTCGATGGAG CTCTACAACC TTGCTGCCTG CAGAAATAAA 
208551 CTCTCCTAAC CAGAAAAATG GAAATACCTT TTGTAAGATT TTTGGTAAAG 

208601 AGGGGAAGAG GGGACGGGAA GAAGCAGCGC TTTCAAGTCT TTTAAGAGAA
208651 GCAGTACGGA CTTCATTTTT TAAGCGTGCG ATACCCTCGA ACGTATCTAT 
208701 TTTCTGTTGT AGTTCTTCAG ATACGTTGTA ATTTGTAGAT GATCCAACTT 
208751 CAGAGTGCAA TTGATTTAGA AGATCAATAA AACTTATGAG ATCTTTAAGA 
208801 TTTGGTTTAG CTTCTGAAAC AAAATCAGAG ACAAATTTAG GATAGTGCGC 
208851 CCTGATTCTA TTTACGAGTT CTTGGGGGAA TACTTGTTCT TTTGATGAAA 
208901 TCATAGGCGT GATTTTTTTT ATTCCTAAAA TCACACCAAT CAAGGCTATT 
208951 AATAATCCTA ATCCTAACAA TGCGCCACTT AGAATATAGG AAACAGGAGC
209001 TGCTACACAC AAGAAAGCTA TCAAAGCTCC GCAGAGTAAG ATGGCACTGA 
209051 TAACAATATG AATAGTGGTT GAATTCTTTA ATTCAAAATA ATAATTACAA 
209101 GAGCGATTAT TTTGAATAAC TGGCGAGGTT ATATTACTCA TACATTTCCT 
209151 AGGCTCATCA ATTTGCGTAT TATATCGTAA ATTTAACAAA AATCCTATTA 
209201 ATAGAAATGT TTTTATTTTT AAAATTTTTA TTGAAATTGT TTTGTATTAT

209251 TGATGAAAAG TTTTCATTGA AGATAGCTTC TAGAAGAAGA ATAACGAGAG 

209301 AGCTTGTTAT GCTTTTTTTA AAAAAAGGGC GAGCCCCTTT TTGTTCTTTT 

209351 CCCGATTGGG AATAACGTAT TCTTTTTATA AAAATTAAAG ATTTTTTTAT 

209401 ATTTCTAAAT CTCAATTATA AAAAGTCTTT CAAACGCAAT TTGCTTGTAG 

209451 GTATTGCAAA TGATCAGACT CCCCTTATAG GGAAGATCGT ATAGGGAGAC

209501 TGCTTTCTAT TTTTTTCTCT AGTTTGTAGG TTTGGGCGTG GAAAGCATAC 

209551 TGAGGTAAGC TTGATGTCCA TTGTGAACTC GTATTACGAA CAAATTGCCA 

209601 TCGTTTTAAC AGATTTTGAT GGTGCGTTGT ATTTACGCAT ATCTTCTGAA 

209651 CAAGAGCACT CGCTGGGTGC ATAGGACTTT CTTTTACTTT CTCTAGTAAA

209701 ACCTTCAATT CTTTCCACGT CATGAAGTTC ACTGTAGGTT CATAGTTAGA 

209751 GGATACTGGA TCGAACATAT TTGTTTCAGT ATTCAAAAAG CCTCCAAATG

209801 TTGCCATGGA ACAGTGGCCT CCTGCTTTAT CAAACTGACA CAACATTTTC 

209851 CAATTATCAG GATTTATAAG TTGAATCATC TGAGCCTGTT CCCAAGTGAT 

209901 ACCATGAGAA AAGAAAAGAA ATAAGAATTG TGAGATTCCT TGAACATCCT 

209951 TCCGGAAAAT CATATTGTGG GGGAGTTCCT TGAATATTTC TTCGCAGGTT 

210001 TTTTTCACAG AAGGAGAATC CCATTGATTC TTAGACGCTT TACTATATAG 

210051 GAACTTATAC TGTGATTCTG AGATTAATGT TAGTAGAGGG CGTGTATAAG 

210101 AGTGGAAAAT AGTTGTATAT CCTTTATGAA ACGCTAAGGG CCCAAGTAAA 

210151 CCATAAACTT TTTGTGTTTT ATTTAATCCG ATTTCCCCAG CAACAGATTC 

210201 AGTTTTGTCT ATAAAATGGG AGAGCCAGTA TAACGGGCAG TTTTGAAGAA 

210251 GAATCTCTTC GAACTCTGGA AACAGGGTTA AATCTATAGA TTTTAGAATA 

210301 TCGATCCCGA AAGCCTCTGC TTTTTTATGT AAATTCGGAG GCAGGTCTGT

210351 ACCGCTTTTC CAGCAATTAA TAAATATTTT TAATTCGTTA ACAGTCAGGG

210401 AGTGTGTTTT CACGAAATAA AAGACTTCTT TAGGATAGCG ATTGTAAATA 

210451 ATCTTCTGAA GTTCGTTGGG GATGGGCAGA ACCTTTGATT TAGCTAGCAG 

210501 TGCTACGACT AGCGTTATAA TCAAGATTAC GATAGCGGCT AAAGCTAAAG 
210551


TTCCTCCAAT 


AGCATAGCTA 


ATCGGCGCAG 


CAAGGAAAAC 


AAAAGAAACj i 


210601 


GCGCTAACAA 


GAGCTAGAAC 


AAGCCCAAGA 


ATGAG 1 UtaOO 


UAAl L.U 1 


210651 


AATTTTTAAA 


GAACAAGATC 




UL. Avj I CA ill 


TT'T A A A A AT A 


210701 


TATGGGGAAC 


TAGTGTTAAA 


(jVjALi ALAL I 


i L. A I Ao I AC A 


APP AP A ATPT 


210751 


CTCACGGTTA 


TTATAACCTT 


TTGATTGAAA 


AAA^jiUtj i At-(j 


A ATTPPTP A A 
AA L i i LiAA 


210801 


ATGGGAGTTA 




TCCCTACCCA 


L 1 AAAAAUL-A 


PPT APPP ATP 


210851 


AACAAGAGTA 


AGAGAAGCAA 


CTCTATGAAG 


AA o V- A Vj Lt A 


TP A ATPTTPT 


210901 


TGAGCCACTT 


CTTGTTCTTT 


AAGAGCAGAC 


J. I AAvjA 


ATA PTTTPTT 
A i Ao I 1 i o 1 1 


210951 


TAACTTAGTT 


GCAGAAACCA 


ACC AAA 1 ACjL 


AA i o A 1 Vj AA-rt 


ACA AHA ATPA 


211001 


CTGCAAGATA 


AGGGGTCATA 


GCTCCAATAC 


1 ICv-AV-ALjAI 


A APP APP A A A 
AA L A Vj L /VArt 


,211051 


CCTTGTTGGA 


TTAAAGCTCC 


TCCTGATTTT 


C LCj AAUU (j L» O 


P APP A APT AP 
L AUL AAL. 1 AL 


211101 


ATCAATAGCA 


GCCTTACCTT 


TGACTTTTTG 


CTCTTGGTCA 


ACj AbCjbA 1 A I 


211151 


AGGCCATTTC 


TTTAGTTGAG 


TCAAAGAGAG 


CGTATT i lb 1 


/^/^ A T^TiT^PP A A 

(j(jAl J. i. LoAA 


211201 


AGAATATTCT 


GTATAGCTCC 


GACAACCACA 


GCTACjL A i (jA 


P A PP A PTTPT 


211251 , 


ACCGAACATA 


GCGACCAGCC 


^ TV TV TV ^^/^fnm/^ 

CAGAAGCTTG 


/^rprprpO'P A A AP 
Cj i I 1 L i AAAo 


ATAAPAAPAP 
A 1 AAC Anvj-rtO 


211301 


CGAAGAAAAC 


GATACCTGTT 


TV TV Tv Tv ^> /** TV 

AGGAGAACCA 


I GAC AbvaAU i 


P A PT APPPPT 


211351 


CCAGTTAACC 


ATCCAAATTT 


ACGAATGACG 


i lACLAL-UAA 


PA AATAPPAT 
CAAA 1 AoV-rt 1 


211401 


GATAAGTACG 


GATACTACGC 


TV rr^^^ Tv TV TV 

CAGTCCAGAA 


GGAGAAGl XL. 


PPP A TP A A PT 
LLC A 1 oAAL i 


211451 


CACTATAGTC 


ATTCATATTA 


^ TV m TV m m Tv 

GGATATTGCA 


GTTTCAGCT O 


A PTTTTPP A A 

Av- 1 1 I i L.LAA 


211501 


GTCACTTCGA 


TTAAGTTAAT 


^ TV TV TV m TV ^> TV 

GCAAATACCA 


TAGGCAATAA 


PPAAPAPAPP 
V-UAAbAoAOL 


211551 


TAATAAAAGA 


ATATAAGGAG 


ATCTAGCAAG 


A«T»A/^A/^PA AP 

A i AvjAtjb AAU 


PT ATPTTTP A 


211601 


TATTCATTTT 


AGGTTTAGCA 


CClTTTi 1 UL. 


pi mmrp rp ^ rp 
Q-C i i i IbUAl 


TTPTTPTnnA 


211651 


TTATAGAAGC 


GAGGATCGGT 


CAATACGTTC 


i i Ai i (jAU^U 


APP Af^TA APT 


211701 


GGCCATAAGA 


ACAAGTCCAG 


ATAL AA TAG I 


LAI AoUCAl L 


A A A Af^APnTA 


211751 AAGAAATTCC CCAAGGATCT ACACCTTCAG AAACGGAAGC TCTCAACTTT
211801 GAAGCCCAAA CAATTGCACG ACCAGAAGCT AGTAAAGAAA TATTAGCTCC
211851 GATACCGAAA AGAGCGTAGA AACGCTTTGC TTCGTGGATT TTTGTAATTT
211901 CATTAGCAAA TCCCCAGAAC ATTAGAGATA GCATGACGCT TCCCCATAGT
211951 TCAGCAAGTA CATAAAATGC AGCAAATGTC CAGTTTCTTA AGATGGCAAC
212001 GAGTCCTAGC AATCCTGGAG GTAGGATGGC CTGTAAACGG TCAGCAAATT
212051 CTGTAGGATG TAAAACATCG CGTAGCGGAT AAATTACAGT CGGGAACAGG
212101 GCAAAGAAAA TTAAAAAGGG CGTTCCCACT GCATAAAATA AGGCCTGCTT
212151 ACTTAAAATA TTACTTAGCT TTGCATAAAT AAGCATAAAG ATAATAGCAC
212201 AGGGGACAAC AAGCCAAAAC TTGATGAAAG GTATTGCCTC TGCACCAGAA
212251 CCAGGAGCTC CCACAATAAG AGTGTCTTTT GTATCGCGTA ACACCGTATA
212301 GTTAAATGTA ATACAGAAGA ACATTAGGAA CATTGGCAGA ACTTTCTTTA
212351 GCTCGTGAGT ATGTATCGGC CACAAGAAAG AGCGCAATTT TCCAAAAGGT
212401 TTTTCTTCGG TTTTTGTCAT ATTTACCCTC TGAAATACTT TTATTTTCTA
212451 TTCTGATAGT TTTTTATTTC GTTGTGGACT AACTTGAATT GTTTCCATTT
212501 CGAAAGACCT TATAATATAC CATCTTTTCT CTCTATTGAC AAGAGGAAAG
212551 GGATACTCCT AAGGGATAAG AGTTCTTAGG CGACTTTATT GCTAAGTCGC
212601 CTGTAAAATT TATTTTTTAA TTAAACGAAA GGAGAGGTAA CTTCAAGGTT
212651 GGGAGAGCTT ACAGCCTTCA CAGCTTCGAC GTAGGCATTC CATAGCTCTA
212701 CGCTCACGAT ACCGTTGCGA ATCGAACGAA TTAACTCCCT TTTTAAAGAA
212751 GTTAACGGTT TTTTAGGGCT AAAGCGAGCG AGGAGGTCTT TGTCTCCAAC
212801 TTGTCTTGCC AGTTCTAAAA CGCGTTCAAT ATCTGTTTTA GTAAAGTTAA
212851 TAAAGTCACG ACGTTTTTTA GAGCCTCTAG GAGCTCTAGG CTTAGGATTA
212901 ATTGTGCGTG GGGTTTCCGA TTCTAAAATA AATGTTTTTA ACATTTTTAA
212951 AAGTTGTTCG GGAGCAGCGC TTTTCATCTT TTTTAGAGTG AAATGATGCA
213001 TATAACCGCC GCTAGGGCCT GGAAGATAGC GACATAAATC GTTTTCTTTG
213051 TTTCCGCAGA CTCTTTTAAT AGCTTTTCCG ATCAGTTTTT CTATTTCTTC
213101 TTTTATATGT GATTGTTCTA CGGCCATGAA ACCCACCTTT TATAATTGAT
213151 GCTTACATAT TTTACAAGAA AGAATGTGGC ATACTTTCAA TTTAATGTTT
213201 AGAGAAATAA AATTAAAGCG CTGTTTTAAT CAATCAGCGA ACATAAATTA
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2132 51 TATTGCAAGT TCTCCTTTTT 

213 301 TTCTTAACAA AATTCAAAAC 

213 3 51 CTATAACATT TTGTTTTCTT 

213401 AAACTATCCT GAACAAAGAT 

2134 51 TGCTGGATAA AATATATTTT 

213 501 TGGTTTTTAA TGGATCTTAA 

213 551 TGGGATATAT ATTTAAAGTG 

213 601 GGTATAACTT TTGGATGTAC 

213 651 TCCTTGTATA CTATCCATGA 

2137 01 TCGTGGGGAA TAGGCTTGCT 

2137 51 CCTCATGCGT ATGAGATGGT 

213 801 TGCCGTAATT TTTTGTAACG 

213 851 GGAAGCATTT AGAAAATAAT 
213901 ATAGCGCGTG GGGCCTTTGT 
213951 TCATATCTGG ATGGATCTTT 
214001 CAGAAGTTCT CATTGAAAAG 

214 051 AATAGTGAGG AACTTGTTTG 
214101 ACAATGCTTG AGCACAATTC 
214151 ATAATGCGTT CAGTTACTTT 

2142 01 GTGGCTTCCG GAGCATGGAG 
214251 TCCAGAAGCT CAAATCAGTG 
214301 TTAATGAGCA TGATGTCAGT 

2143 51 GATGCGTTGA AAAAAATTGT 

2144 01 TCTAGCTCAA AAACCATTGT 
2144 51 GCACCTTTAA ACATAATGTC 
214 501 GCTCTTGAAT GTCAAAGATG 
214 551 TAAACTATGA GCATGCAGCC 



GCAATATTTT TCCTCTCAGG ATCTTTTTGT 
ATAAAAATCG TAAAGTAAAG AAGATTTTTT 
ATAAAACAAA CGGGTTTTCT AATTTTTTTA 
ATTTCTTTCA TTTCTATTAG TTGTATTTCT 
GAATAGAACT TGTTTTTCTG GTACTTTAAG 
AAAATGCTTC TAGAGAGATG GATGCGAAAA 
ATGCGTTGGA TTTTCTGTTT CGTGGCATGT 
CAATTCTGGG TTTCAGAATG CAAATTCACG 
ATCGCATGAT TCATGATTGT GTTGAAAGAG 
ACCGCTGTTT TGATCAAAGG ATCCTTAGAC 
TAAAGGGGAT AAGGACAAGA TTGCTGGAAG 
GCCTGGGTCT TGAGCATACA TTAAGTTTGC 
CCCAATAGTG TCAAGTTAGG GGAGCGGTTG 
TCCTCTAGAA GAAGACGGTA TTTGCGATCC 
CTATTTGGAA GGAAGCTGTC ATAGAAATTA 
TTCCCTGAAT GGTCTGCTGA ATTTAAAGCA 
TGAAATGTCT ATTTTAGATT CTTGGGCGAA 
CTGAAAATTT ACGGTATCTT GTCTCAGGTC 
ACACGTCGCT ATTTAGCTAC TCCTGAAGAA 
GTCTCGTTGT ATTTCTCCTG AGGGTCTATC 
TTCGTGATAT TATGGCGGTT GTAGATTATA 
GTGGTTTTCC CTGAGGATAC TCTGAACCAA 
TTCTTCTCTG AAGAAAAGTC ATTTAGTTCG 
ATAGTGATAA TGTGGACGAC AATTATTTTA 
TGCCTTATCA CAGAAGAATT AGGAGGGGTG 
AGACTTTTTG GTCTGTACAC AACCTTTGTG 
GTTCTTTATC ACATATCCTT TTCCTTGGGA
214601 AAGGGGTCAT TAACTGCTAT TTTAGGTCCT AATGGAGCTG GTAAAAGCAC

214651 TCTCTTAAAG GCTTCCTTAG GCCTGATCAA ACCCTCTTCG GGGACTGTTT

214701 ATTTTTTTAA TCAAAAATTT AAGAAGGTGC GTCAGCGCAT AGCCTATATG

214751 CCTCAGAGAG CTAGCGTGGA TTGGGATTTT CCAATGACTG TCTTAGATTT

214801 AGCCCTTATG GGGTGTTACA GCTATAAAGG AATGTGGGGG AGAATTTCTT 

214851 CGGATGATCG AAGGGAGGCC TTTCATATTT TAGAAAGAGT TGGTTTGGAA 

214901 TCCGTAGCAG ATAGACAAAT AGGACAGCTC TCAGGAGGAC AGCAACAAAG

214951 AGCATTTTTA GCACGTGCTT TGATGCAAAA AGCAGATCTA TATCTTATGG

215001 ATGAGTTGTT TTCAGCGATT GATATGGCTT CGTTTAAAAC ATCTGTAGGG 

215051 GTTTTGCAAG AGCTGCGAGA TCAGGGAAAG ACTATCGTCG TTGTTCATCA 

215101 TGACTTGAGT CATGTGCGTC AACTATTTGA TCATGTGGTT TTATTGAATA 

215151 AGCGTTTGAT TTGTTGTGGC CCTACTGATG AATGTCTGAA TGGAGACACT 

215201 ATTTTCCAAA CGTATGGTTG TGAAATTGAA CTTTTGGAAC AAACCCTGAA 

215251 GCTCTCTCGA GGAAAACAAT TTGGATCGTG CTGATTATGC TCAGTTGTGT 

215301 TTTTTCTGAT ACGATTTTCT TATCTAGTTT TTTAGCTGTC ACTTTGATTT 

215351 GTATGACCAC AGCTTTGTGG GGGACAATTC TCTTGATTAG CAAGCAGCCT 

215401 CTTTTAAGCG AAAGTTTATC TCACGCGTCG TATCCAGGAC TTCTAGTTGG 

215451 AGCTTTGATG GCGCAATATG TTTTCTCATT GCAAGCTTCT ATTTTTTGGA 

215501 TTGTGTTGTT TGGGTGTGCT GCTTCGGTAT TTGGTTATGG GATCATTGTT 

215551 TTCTTAGGGA AAGTATGTAA ATTACATAAA GACTCCGCCC TTTGTTTTGT 

215601 TCTTGTGGTA TTCTTTGCTA TCGGAGTGAT TTTAGCCAGT TATGTCAAGG 

215651 AAAGTAGCCC TACGCTATAC AATCGCATTA ACGCCTATCT ATATGGGCAA 

215701 GCAGCCACTT TAGGTTTTCT TGAAGCTACG TTGGCTGCGA TCGTCTTTTG 

215751 TGCTTCGTTA TTTGCTTTAT GGTGGTGGTA TCGACAAATT GTTGTGACTA 

215801 CTTTTGATAA AGATTTTGCT GTTACTTGTG GCTTAAAGAC TGTTCTTTAT 

215851 GAAGCACTCA GTCTAATTTT TATATCGTTG GTGATCGTAA GTGGAGTTCG 

215901 AAGCGTAGGG ATTGTTTTAA TTTCTGCTAT GTTTGTGGCT CCTTCTTTAG
215951 GTGCTCGTCA GCTTTCCGAT CGTCTAAGTA CAATTCTTAT CCTTTCTGCA

216001 TTCTTTGGAG GGATTAGCGG AGCTTTAGGA AGCTATATCT CTGTAGCATT 

216051 CACATGTCGT GCTATTATAG GGCAACAGGC GGTGCCTGTA ACCTTGCCTA 

216101 CGGGACCTTT GGTTGTCATT TGTGCTGGAT TATTGGCCGG TCTATGTTTG 

216151 CTTTTTTCTC CAAAATCTGG GTGGGTCATT CGTTTTGTCC GTAGGAAGCA 

216201 CTTTTCGTTT TCAAAGGATC AAGAACACCT TTTAAAGGTG TTTTGGCATA 

216251 TTTCTCATAA TCGTTTAGAG AACATTAGTG TTCGAGATTT TGTCTGTAGT 

216301 TATAAGTATC AGGAGTATTT TGGGCCTAAG CCTTTCCCTA GATGGAGAGT 

216351 TCAGATTTTA GAATGGCGGG GTTATGTTAA AAAAGAACAA GATTATTATC

216401 GACTCACAAA AAAAGGAAGA AGTGAGGCCT TAAGATTAGT TCGTGCTCAC 

216451 AGATTATGGG AATCGTATCT TGTGAATTCT TTAGATTTTA GCAAGGAAAG 

216501 TGTTCATGAG TTGGCTGAGG AAATAGAGCA TGTTCTTACT GAAGAATTGG 

216551 ATCATACCTT GACAGAGATT CTCAATGATC CTTGTTATGA TCCTCATCGA 

216601 CAAATTATCC CAAATAAAAA AAAGGAAGTC TAATGGCTTT GGGACCTTCT 

216651 CCTTATTATG GAGTATCTTT TTTCCAATTT TTTTCAGTAT TTTTTTCGAG 

216701 ACTGTTTTCT GGAAGTCTTT TCACGGGTTC TCTCTATATT GATGATATTC 

216751 AGATTATAGT ATTCCTTGCT ATTTCCTGTT CAGGTGCTTT TGCAGGAACT 

216801 TTTTTAGTCT TGCGAAAGAT GGCTATGTAT GCGAATGCTG TCTCTCATAC 

216851 TGTCCTTTTT GGTTTGGTCT GTGTTTGTTT GTTTACGCAT CAACTGACGA 

216901 CCCTCTCTTT GGGTACCTTG ACTCTTGCAG CAATGGCAAC AGCTATGCTG 

216951 ACAGGGTTTC TTATTTACTT TATTCGTAAT ACTTTTAAAG TTTCAGAAGA 

217001 GAGCAGCACC GCTCTAGTCT TTTCTTTATT ATTCTCTCTG AGCCTTGTTT 

217051 TGTTAGTCTT TATGACAAAG AATGCTCATA TAGGAACGGA GCTTGTGTTA 

217101 GGAAACGCAG ATTCTTTAAC GAAAGAGGAT ATTTTCCCTG TCACTATTGT 

217151 GATTTTGGCT AATGCTGTAA TTACTATTTT TGCGTTCCGT AGCTTAGTTT 

217201 GTTCTTCTTT CGATTCTGTA TTTGCCfCTT CTTTAGGAAT TCCTATTCGG 

217251 TTGGTTGATT ATTTGATTAT TTTTCAACTT TCTGCATGTC TTGTAGGAGC
217301 TTTTAAGGCT GTAGGTGTAT TAATGGCACT TGCTTTTCTG ATCATTCCAT

217351 CGCTTATTGC TAAGGTTATT GCAAAATCGA TAAGGAGTCT TATGGCTTGG 

217401 TCGTTAGTTT TTAGTATTGG TACAGCATTT TTAGCTCCTG CATCTTCGAG 

217451 AGCAATTCTT AGTGCTTATG ATTTGGGGTT ATCGACTTCG GGAATCTCTG 

217501 TAGTGTTCTT GACGATGATG TACATCGTGG TTAAATTTAT AAGCTATTTT

217551 CGAGGCTATT TTTCTAAAAA TTTTGAAAAA ATAAGTGAGA AAAGTTCTCA

217601 ATATTAGCAG TGATTTAAAA GAATGAAATT TTAGAGTTTA GTCCACTTTA

217651 GTTAGAAATA GTATCAATAG AGAATTGACA ATTCCTCGAC TTGCGGAGTA

217701 TGATTCTCTT TTTCACCTAG TTAAAGGTAG CATGCTTGAA ACATTTAGCC 

217751 GTTCTTGGGT CAACAGGTAG TATTGGCCGT CAAACATTAG AGATTGTGCG

217801 GCGCTATCCT TCAGAATTTA AAATTATTTC TATGGCTTCT TATGGAAATA

217851 ATCTAAGGTT ATTTTTTCAG CAACTAGAGG AGTTTGCTCC GTTAGCCGCA

217901 GCGGTCTATA ACGAAGAGGT TTATAACGAG GCCTGTCAGC GATTCCCCCA

217951 TATGCAATTT TTCCTAGGCC AGGAGGGTTT AACCCAACTT TGTATCATGG

218001 ATACAGTCAC TACTGTCGTT GCTGCTTCTT CAGGAATCGA GGCGCTACCC 

218051 GCGATTCTAG AGTCGATGAA AAAAGGAAAA GCACTAGCTT TAGCAAACAA 

218101 AGAAATTTTA GTTTGTGCTG GCGAATTGGT TTCTAAGACT GCAAAGGAAA 

218151 ATGGTATAAA AGTTCTTCCT ATTGATAGCG AGCATAATGC TTTGTATCAA 

218201 TGTTTAGAAG GCAGGACGAT TGAGGGAATC AAGAAACTGA TTCTTACAGC 

218251 TTCTGGAGGG CCTCTGCTCA ACAAGTCTTT AGAAGAGCTT TCTTGTGTAA 

218301 CAAAACAAGA TGTTTTGAAC CATCCTATAT GGAATATGGG TTCAAAAGTG 

218351 ACTGTGGACT CATCCACATT GGTCAATAAG GGACTCGAAA TTATCGAGGC 

218401 GTATTGGCTG TTTGGTTTAG AAAATGTTGA AATCCTGGCT GTAATTCATC 

218451 CTCAGAGCTT AATCCATGGT ATGGTAGAGT TTTTAGATGG GAGTGTGATT 

218501 TCTATCATGA ATCCGCCTGA TATGCTCTTC CCAATACAAT ACGCTTTAAC 

218551 AGCTCCAGAG CGTTTTGCAT CTCCTAGGGA TGGTATGGAT TTTTCGAAGA 

218601 AACAAACTTT AGAATTTTTT CCGGTAGATG AGGAGCGATT TCCTAGTATC
218651 CGTTTAGCAC AACAGGTATT AGAGAAACAG GGGTCTTCTG GAAGCTTTTT

218701 TAATGCAGCC AATGAAGTAT TAGTGCGGAG GTTCCTTTGC GAAGAGATTT 

218751 CTTGGTGTGA CATTTTACGC AAATTAACGA CTCTTATGGA ATGTCATAAG 

218801 GTTTATGCCT GCCACTCTTT AGAAGATATT TTAGAAGTAG ATGGTGAGGC 

218851 TAGAGCTCTT GCTCAAGAAA TATAATCGAG TAGGTATATG ACAATAATCT 

218901 ATTTTATTCT AGCAGCCCTA GCTTTAGGGA TTTTAGTGTT AATTCATGAA 

218951 CTTGGTCATC TGGTAGTAGC AAAAGCTGTA GGAATGGCTG TAGAGAGTTT 

219001 TAGCATAGGC TTTGGTCCTG CTTTATTTAA AAAGCGTATA GGCGGCATAG 

219051 AATATCGCAT TGGATGCATT CCTTTTGGAG GCTATGTTCG TATCAGAGGT

219101 ATGGAACGTA CCAAAGAAAA AGGGGAGAAG GGGAAGATAG ACTCTGTCTA 

219151 TGATATTCCT CAGGGATTTT TTAGTAAGTC TCCTTGGAAA CGCATTCTGG 

219201 TTCTTGTTGC TGGTCCTCTT GCCAATATTT TATTAGCTGT CTTGGCTTTC 

219251 AGCATTCTTT ACATGAATGG GGGAAGAAGT AAAAATTATA GCGACTGTTC 

219301 TAAAGTGGTA GGTTGGGTCC ATCCTGTTTT ACAGGCAGAA GGATTGCTCC 

219351 CTGGAGACGA GATTCTTACG TGTAATGGTA AGCCTTATGT GGGAGATAAG

219401 GACATGCTAA CAACCTCTTT ATTAGAGGGG CATCTCAATC TAGAAATCAA
219451 ACGTCCTGGC TATTTGACAG TTCCTAGCAA AGAGTTCGCT ATTGATGTTG
219501 AGTTTGATCC CACAAAATTC GGGGTTCCCT GTTCTGGAGC GAGTTATCTT
219551 TTGTATAGCA ACCAGGTGCC CCTAACGAAG AACTCTCCTA TGGAGAATTC
219601 AGAGCTACGT CCGAATGATC GTTTCGTTTG GATGGATGGC ACACTTCTTT 
219651 TCTCAATGGC TCAGATATCT CAGATACTCA ATGAGTCTTA TGCTTTTGTG 
219701 AAAGTAGCAC GGAATGACAA AATCTTCTTT TCTCGTCAAC CTAGGGTATT 
219751 GGCTTCCGTT TTACATTACA CTCCCTACCT TCGTAATGAG CTTATAGATA 
219801 CGCAGTATGA GGCTGGACTT AAAGGCAAGT GGTCTTCGTT ATATACATTG 
219851 CCTTATGTAA TCAATAGTTA TGGATACATA GAAGGTGAAC TTACTGCTAT 
219901 AGATCCAGAG TCTCCTTTGC CACAACCTCA AGAGAGGCTA CAGCTTGGGG 
219951 ATCGCATTCT AGCTATTGAT GGAACTCCTG TTTCTGGAAG TGTAGATATT
220001 TTACGTCTTG TTCAGAACCA

220051 TCCGCAAGAA CTTGAAGAGG 

220101 TCGCCTCTTA TCATTCCGAA 

220151 GAGTCTCACC CAGTAGAAGT 

220201 TCAGCCTCGT CCTTGGATTG 

220251 AGTTGGAAGT AGCTAAGAAG 

220301 TTGGAGCGTC TTGATGCTGA

220351 GAAAGATCTT AAGGTGAGGT

220401 ATATTACTAA GGAAAGTTTG
220451 CTGAGTCCAC AATGGCTTTC
220501 TACAGGATGG TCGGTAGGGT
220551 TTAGTATGAA TTTGGCTGTC
220601 GGAGGTTATA TCCTTCTATG
220651 GAATATGAAG ATTGTGGAAA
220701 TCATCTTCTT TATTTTTCTA
220751 TAAGGCTCCA TTTTTTTGGA 
220801 AAGCAGTGTA TAACGTAGAT 
220851 GTTCTTTTGT TTGTCTTCCA 
220901 CGTGCTCGAT TATCGGGGTT 
220951 AGATAGAGCG ACTTCCGATT 
221001 CACGTAAGCT TTATGTGGAG 
221051 TACCATCGTT TTCATAGGGC
221101 TTCCCACTAA GAATGAGAAC 
221151 TAACATCTTT GTGGGGCAGT 
221201 CAAATACGAA CTGTACGAAT 
221251 GACCCTGCGT TCGGAGGAGC 
221301 TCGGGGCAAT CAGGGTGAGC



TCGGGTCTCT ATTATTGTTC AGCAGATGAG 
TGAATTCTCG AGATGCTGAT AAGCGGTTTA 
GATCTGTTAC AAATTTTGAA CCATTTAGGA 
CGCGGGTCCT TATCGTCTTC TTGACCCTGT 
ATGTTTATTC TTCGGAGAGT TTGGATAAAC 
ATTAAGAACA AGGATAAACA AAGATACTAT 
GAAGCAAAAA CCATCTTTAG GGATTTCTTT 
ATAATCCTTC ACCTGTGGTT ATGTTATCAA 
ATCACCTTGA AAGCTTTAGT TACTGGACAT 
AGGACCTGTG GGTATTGTGC AGGTTTTACA 
TTTCTGAAGT GCTCTTTTGG ATCGGTCTAA 
TTGAATTTGC TTCCTATTCC TGTTTTGGAT 
TTTGTGGGAG ATAGTCAAAA GAAGACGTTT 
GGATTTTGGT TCCGTTCACT TTTTTATTGA 
ACTTTTCAGG ATTTATTTCG TTTTTTTGGT 
AATCCGTAAA GGTTGTAAGG ACGAGAACCA 
GTCTTGGTTA GGGTAAGGCT TAAGTCTTGA 
TAACAGTTTG TTCTAAAGCT GCTTCGGGAA 
GTATTTCCTT CTTTTAAAAA CTCTTTCATA 
AGTAATGATG TAGGTATGAC TCGTGTGGAT 
TATCTAGCTT TGTCTCTATA GTGCATACCT 
AGGCGGAAAG GAAGGAATTT GCTATGTCTG 
GTCTAAAGAC GAAGGGAGTT TCCCGACATT 
AGGTAAGAAG CTGTCGTCCT AATTTTCCTC 
TTCACACAGC GGTAGCGTCT AGCTAGTGTA 
CATGAGAATA GCTTTTCTTT TTTGGCTTCT 
GAGTGCTACA CGAACAATGA CTCCTCCAAT
221351
221401 
221451 
221501 
221551 
221601 
221651 
221701 
221751 
221801 
221851 
221901 
221951 
222001 
222051 
222101 
222151 
222201 
222251 
222301 
222351 
222401 
222451 
222501 
222551 
222601 
222651 



AGAATGAGTT 
TTTTCAGCAA 
CGCGTCTCAT 
GCGAGCAATA 
AGACCACGGA 
GATGGAAGGG 
AGCTAAAAAG 
ATTATTTTAA 
ATTTATAAAA 
GTTATTGTTT 
AAACAAGGCT 
AGAGTCTGAC 
ATTATCTAAT 
GATGAGATTG 
AGACTGTGTT 
CATAGTGAGT 
ATTCGAGATC 
GCAACAGCAG 
TTTTAAGGCC 
AATCTGAAAG 
CCGTGTTTGA 
CTTTTTAAGA 
GCAGAGGGTA 
GGCGACGATC 
AAAAGCACAA 
AAGTTGGTTA 
GAGCTTGGGG 



ACGAAGTTTA 
GCGATTGAGA 
AATTCCAAAT 
GGTTTTAAAG 
TTCTTTTTGT 
TTTGGATTAC 
AGTATAGTTA 
ATAAGTTTTT 
ATAACTAATT 
TAGATAATTT 
TGTTTTTGGA 
CCAGAGTTGG 
CCAGCATGGA 
CTTTAGATAT 
TCTGCCCTTC 
AGAAAGTCTT 
TCTAGGAAGT 
TTTCAGAAAT 
AATTGTTCTT 
TTTCTGACTA 
CCAACTGTTC 
GAGCATTTCT 
TAGTGGTTAT 
CGCAGGAGCT 
TAGGTACTTT 
TAGCAGATTT 
AAGGTGGTCT 



TAGGGACTCC 
TGTTCAGCAT 
AAAGACATCG 
ATGTATAAGA 
TTTGAGGTTT 
CGAGGTTTCC 
ATAAAAATTT 
TATTAAAAAA 
CGCTTACTTT 
GCTCAGAAAC 
AGTTCCCCAT 
GGCAGGGTCA 
TATCATCTAG 
AGGCACTCAG 
ACTAGAGAAT 
CGCGATGAGG 
GATATAGAGA 
ATCAGAATTT 
TTAGGTTGTT 
CAGAGAAACC 
ATCCCAGAGG 
CTGCTGAAGA 
CGCATTGAGA 
CCTGAAATTA 
CCCTATCAGC 
TTTTTCCTTG 
TTCTCAAACT
AGGCTTAAGT
GCTTTTCTAG 
TAATGTTCTT 
TCTTAAAAAC 
CCTTTAATCC 
GAGAATAAAG 
CTTCATGTGT 
GAATATGTTT 
TAACGCAAAA 
TTGAGCGTTC 
GCATATGGGT 
AGGAGTTGAC 
ACAGACTAGA 
CAAGCCTTAA 
TGAGACACAG 
GCCAACGGAA 
GCTGTTTATG 
TTAATTAAGG 
AGACCAAAGT 
GTTGGATGGA 
CCACGGTTGA 
GCGCGATGAT 
TAAAAGCAGA 
GAAGGCGGTC 
TGCGATAAGG 
CTTGTCTGTA 
GTGTTTCTAA 



TCAGCTATTT 
AGTAAACTTG 
TTTCTAGAAC 
GCATGCACGC 
CCCAATTCCA 
GATTACCCAC 
CACCGATATA 
ATTAAAACTT 
GAAAAATGAT 
TCGATACTTA 
GGAAGTAATC 
CGACACGTTC 
GGGGAGACGT 
GATTGCCAAA 
GCATTTGGTT 
GTGCTTCCTC 
AAATTCTTCG 
AACTTTTAAA 
TCTTTGGATA 
TAGGTAGGTG 
GGTTTGCTTG 
AGTACGATAA 
TTTAGGAAAA 
TTTTGAAGAG 
TTTTTATAGG 
TAGATGGAGA 
GAAGAAATGG
222701 GAAGATCCGA AGGTGATGGT ATCTGTGAGA TGTTGCGTGC GAAAAGACCT

222751 TCCCAAGGAC AAAACATAAA GCGCTTCTAG GAGGTTTGTT TTTCCTTGGG 

222801 CATAATTGAG TTTAGGAGCC AGTGAGATTT CTAAATCACT GTGGTTACGA 

222851 AAATTTTTTA GCTTCAGAGA GCAGATTTTC ATAAACATCG GCAGGGTAGT 

222901 GACCTAAAGG ATTCTCCTTA GGGAGTTTAT TAATCATCAT GTAGCCTCAT 

222951 AGGCATGATG ACAAATAATC CTGAGGCAGA ATCGGTAATG ATTCCAGGAT 

223001 TATAGGAATC CGAGATCCCT AAGCTGACTA ATTCATCCTT ACTATGCTTC

223051 AGGATATCTA AAAAGAAAAA GGGATTAAAG GCAATTTCTA GGAGTTCGCC

223101 AGAATAATTT ACAGCCATGC TTACCTTTCC TTCACCCACC TTAGTACAGT 

223151 TGGCTGTTAG AGTGAGCTCT CCGGGTAAGA AAGAAAACTT CACGGAGTGA 

223201 GAGGACTCAT TTGTAAATAA AGCCACTTGT TTGAGCAGAG TAATTAGTTC 

223251 TTCGCGATGC AGATCGAGTT TTACGTTGCT TTCTGTAGAT ATGACGGGGG 

223301 AGAAATCTGG AAATTCTCCA GAAAGAAGTT TTGTGATCAG GAGAGTATTG

223351 TCACATTCAA CCGCAATCTT ATCTTGATCC AAGAAGATCG CAGCTTCACC 

223401 TTCATCGGAG CACATCTTTA TAATTTCTTC TACTGCTTTG ATAGGAATAA 

223451 TATATTCCCC AGAAAAACTT TTATCTAAAG TAACTTCAGC ATCTATTTTT 

223501 GCTAAACGCT TTCCGTCAGT CCCTACGATG GTAGCCACGC CATTGGCGAT

223551 AGCAAGCAGG ACTCCAGTAA GAACATAGCG GCTTTCTTCT CTAGATACAG

223601 CGAATGAAGT TCTCTGTAGC ATGGTTTTTA GCTGCTCTGC AGGCAAGGAA

223651 AAACGCAAAG CATTTTGTAT ATCAGGGAGC ATGGGGAAGT CTTCTTTTTC

223701 CATGCTGAGT AGGCGAAAGC ATGAAGATCC CGAGGTGATT TGTGCCATTT

223751 CCCCTGCTGA AGAGGAAATT TCTAAATTTG CCTCTGTTAA TTCTTTTACT

223801 AATTGAAAAA ATCTCTTGGA GGGAATGGAA ATAGCGCCTT TCTCATAGAC 

223851 TTTAGCTTTG GTGACGCAAC GTGTGCTCAC TGTCAGATCC GTAGCAGTGA 

223901 AAACTAATTC ATCATTATAA GTTTCAATCA AAACATGGGT GAGTACTGGA

223951 ATAGGTGTGT TTTGAGGGAC GACACTTTGA ATTTTTTTGA TAAGGTTTCC 

224001 TAGCTCATTT CGGGATACAA CGAATTTCAT ATTTTCCTAT AACCTGAGTC
224451


ATTAAACGCG 


AGTATTGCTC 
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TAT A APPATH 


224501 


AGGAGCGTCG 


TAAACGTAAA 


fT^ m rvy fT^ rr^ 


A TAP AT ATP A 
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224551 


TTAGAGGGTA 


AGATTGCTCA 


A A A r» A TT* 

AAA(jCjLjL A 1 L> 


A PTTTP A TTP 
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PTPTnnnAAT 


224601 


GTTTCTGAGT 


CGCGGCTATG 


TTAAGGTALG 


n^'PT'^PPTTP T 
ill i 1 L> 1 


TPTPPTPPP A 


224651 


AAAAAGCTTA 


TGATAAGCGT 


CGI ALbAlLA 


TAPA AAPAPA 
1 AUAAAvjriVjrt 


A A Ann A APnT 


224701 


GAAGTTGCCG 


CTGCTATGAA 


GAGGCbCCA i 


PA TT'P AT* AT A 
A i 1 oA 1 M 1 


PnTTAPrJATA 


224751 


TGGTGTTCTT 


CAGCCCACTG 


TTTTGL 1 1 L 1 
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PA A A APTP AT 


224801 


GAGGACTGTG 


GCAATAGCGT 


CGGCGTATGC 


0/^AO/^'T'/"PP A 

CjC AGC i, CboA 


TPP APTAPTn 


224851 


AAACACTTTG 


GATAGGATAG 


GAGC I i AbL 1 


r^T* APPPPTTT 


PPPTPTAmA 


224901 


GTATCAAGAA 


TATGGGTGTA 


AATTT. i ICC i 
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ATTTTTn A AT 
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224951 


ATGATTTCCA 


CTTGTTGCAA 
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ATPP ATATPT 
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A AP ATPflTAP 


225001 


CTGCTGCTTC 


AGAAAAAATA 


CGCCAAGGTC 


n^T'O^PP A PPP 


ATP ATPPPPT 


225051 


GACGTTTTGA 


TCTCTCCTCC 


CCACTCTACA 


nn A /^T'T'PTTPP 
i Ab i 1 Li i i Ub 


PAPA AAAPPT 


225101 


ATTGCAAATT 


TCATTTAGAC 


AATCTACGGC 


ATA A PPTTTP 
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APAAPAPPAP 


225151 


AGAGGTCGAT TTGAACATGA GGATTCTTTT


TP ATT Af^Af^T 


TTTTGTGTTT 
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225201 


GACTGAAACT 


CCAAGTGTTG 


CCAGCCCATG 


TPTTTATA AT 


PTTGTTCCCA 


225251 


AACGTCTTTA 


GGGGGGAGGG 


TTTGACTTTT 


GAGATGTAGA 


AGCCATAGGG 


225301 


TTTTTAAAGG 


TCCTACAGTA 


GGGTCAAAAC 


GTCCTTCTGA 


AAGTTTGTAA 


225351 


AGTGTATCTA 


CCTGATCTAG 


AAACTCGGAA 


AGTTCTACAG 


ATAAAGTTAT
225401 GGGGACATCT GCTGGAGCTC

225451 TCCAGTTGTT ATAAATCGAG 

225501 TGGGATAAAG ATGCTTTTTC 

225551 GCGATAGAAG ATTGTCATCT 

225601 AGCATGAACA GAGTCCAAGA 

225651 ATCGCCATGT ATTGCTCATG 

225701 CCAGGAACTG GAGTGATTGC 

225751 TACATCCCCA AGAAGAGTAT

225801 TTGTTCCTAC ATCTACGATC

225851 ATAAAAAGCG GTGCTCCTAG

225901 GATTTCTGGG AGGTTTTCCG

225951 TAGTTTGAGG ATGCTTTTGC

226001 ATGTTGCTTC TCCCTACAAT
226051 ATAATAGTTC AGGAGTTCAA 
226101 CATCAAAATT TCCAAGGAGC 
226151 TCCACATCTT TGTCTGGGGA 
226201 GTGTTTGGGC AAGGGAAGTT 
226251 GATTCAATCG TTCTATGAGC 
226301 GGTAACTTGT GCGCTTTGGA
226351 TTTCATGCCA ACGTACACCT 
226401 CAGCAAGCCC CGGAGAGGTA 
226451 TGAAGGATTT TTTCAGCTGC 
226501 CTAAAAAAAG ATATCATTTT 
226551 TATATAGATT ACAGAAGTGA 
226601 GAGGTATCAT TAGCACATCT 
226651 GCGCACATAG GAATGGTTAG 
226701 CACAATGACC AAGGCGGCTT 



PCT/US99/26923 

GGTTGATTAT CGAGAGTTCA GAATAGGGAT 
TCGATCTTAT GAAAGCATCT ATCAATTTGT 
TTTTGCGGAT AAAGAGGTTC CCAGAACAAT 
GCTCTCCTTC GATTGTTGTC GTTTTTTGAG 
CATAAAAGAA CTAAGAAAAA TTTTGGTAAC 
AGCATAGCGA CAGTCATGGG ACCAACGCCT 
TGCGCATTTT GTCACAACGT TATTAAAATC 
AGCCTTTCGC ATTGTCTGCA GGGACTCTTG 
ACAGCATGTG GGGCTACCAT AGTTTCCTTT 
AGCAGCAATA ATGATATCAG CTGTCTTTAA 
ACTGGCTATG AAGAACTGTG ACTGTACAGT 
ATCATGAGGG CCGCTAAGGG TTTCCCCACG 
AGCGGCATGG CGGCCTCGAA GAGGAATTTC 
TAATTCCTGC AGGAGTGCAG GGTAGAAGTC 
AACTTTCCCA TGTTCACAGG GTGAAGCCCG 
GATCGCTTGG AGAATCACTT CGCTGTCCAA 
GCACGAGGAT GCCGTGGATG CTAGGATCTT 
TTAAGGACTG AGGAGAGGGT AGAGTCAGAG 
GATAATTCCG ATTTCTGTAG CTTTTTTGAC 
CAGATGCGGG GTCATTGCCA ATCAGGACCA 
GGACTTTGTG AGATTTCCTC TTTGAGTCTC 
AGGAATCCCT CTCAGTAACA TACCAATCTC 
ATGGATTCCT GAGTTTATTG CGCAAGGAAG 
AGATGAGATT ATCTAAGAAA GAACATTTTA 
ATTGACCAAC GGACTTCATG TCCGAACAAG 
GAGTAAAAAG ACTCCATAGA AAATCATATT 
TGCTCTCTTT TGAAAAAATG AGGTGTGCCT
226751
226801 
226851 
226901 
226951 
227001 
227051 
227101 
227151 
227201 
227251 
227301 
227351 
227401 
227451 
227501 
227551 
227601 
227651 
227701 
227751 
227801 
227851 
227901 
227951 
228001 
228051 



TATGGGTGAT 
CTTCCTAAGG 
TCCCTTATCT 
AAAGGATCAG 
AAAATACACA 
TTACGCTTTG 
TATAGCAAAG 
GCTTTTAAAC 
TTATTCTTAA 
TTGTTTTTGT 
GCTTTGGCTC 
TGCGCGGATA 
AGAAAACTCT 
CTCTTAGATT 
TGAGAATGGG 
TCCGTCTTTA 
CCCGTGATGG 
TGTCTCGGTA 
GAGTGATCCC 
GTTCTCTGTC 
TTTTTTTTGT 
AAATTAATTA 
TTTAAATTCA 
CCTTCTTCGG 
AGAAGGCCCG 
CCTCCTCTGC 
ATGTATGCAA 



CAGAGCAGTT 
CAAGATCACT 
TTTGAGAAAC 
ATGTGCTTTA 
AAATAAAGCA 
AATTATGATT 
CATAGGATTC 
AATTATTTAT 
TAAGTTTTTT 
ATTTTTTATG 
AGGAGATTCA 
AAAGCTAAAA 
TAAACACCGT 
CTCAGAAGAA 
TTTGTTTTTA 
GGCGGGGTTA 
AACGACAGGA 
TTAAGATCTC 
TCTAGCTCGA 
CTCCCATAAG 
TTTTTAGATT 
TTTTTTGTTT 
GAGAATCATT 
ATCCAGGAAA 
TCTCCGCTAA 
TGCGAAGCAG 
CAGAATCTCA 



AGAGGGTAGA 
TAAGACACCA 
AATAGAAACA 
GGGAAGAGCG 
TTTGCGCGGT 
ATCGATTTCA 
TTAGCTTGAT 
ATAATATTTT 
AATTAAAATT 
GGGAAGCCTA 
AAAGAAATCA 
ATCGTCGTAA 
GCTCAAGAAT 
GGACACCGAT 
CTGACAAGGA 
AATTAAAAAA 
AGCCTGGGTA 
CAAAATCTTA 
CAGCGATGCA 
TTTTTTTACG 
TTTAAAATTT 
TAAAATTATA 
ATGGCAGTTT 
GTGGAATCCT 
AAGAATCTAT 
GAAAGCTTAG 
GATAAATAAG
GCAGTAGAAA
CAGCCTAGAG 
AAGAACAATA 
TTGGAAGGGT 
GTTATGTGGA 
AAAGGTTTTG 
TTTTTATGTT 
ATTTTTAGAT 
AGTTTATTAA 
AGAAGAGCAG 
ACGGAAGTGT 
ATTTCTTATT 
ACGATCAGTT 
AAAGTTTTGA 
CCATTTTAGT 
AAATCGCAGA 
ATTAATTTAG 
ATACCTAGAC 
CTTTGTTGCA 
CAAGGTGTTT 
GTTTGTAAGT 
AGTAGTTATA 
CAGGTGGCGG 
GCTCTGCAAG 
ATTTTCTGAA 
TGCGTTCAGG 
GCTAAGTATC 



TACACCGCGG 
CGAGTACCAG 
TAAGGGGAAA 
GAACATGGAT 
CAGACATGGT 
CAGACTTGTA 
TATCTAAACT 
TAAAATTAGT 
GTTGATTTTT 
AACGGATAGG 
TGAAGAAGCC 
GCTAAGGAAC 
AGTTCGCTCT 
TTTTCAATTA 
AAGTACTCTA 
ACGGGACTAT 
AGTTAGGAAC 
AGTCGGATCC 
TCTACTCGCT 
CACCTTGCGT 
TGTTTTTTTT 
AGTTTTATTT 
AGGGGTTCAG 
GAGAGCAGGC 
ACCAAGCAGG 
ATCTACAGGA 
GTAAAGCTCA
228101 AGATCGATCA TCAACCTCTC CAAAATCCAA ATTGAAAGGT ACATTTTCTA

228151 AAATGCGCGC TAGTGTGCAA GGATTCATGT CAGGATTCGG ATCTCGGGCT

228201 TCGAGAGTGT CAGCAAAGCG TGCTTCCGAT AGTGGTGAGG GAACATCCTT 

228251 ATTGCCGACA GAGATGGATG TTGCTCTAAA GAAGGGAAAC CGTATTTCAC 

228301 CTGAAATGCA GGGATTTTTC TTAGATGCTT CGGGTATGGG AGGGAGTTCC 

228351 TCTGATATTT CTCAGCTTTC TTTAGAGGCT TTGAAATCTT CAGCATTTTC 

228401 AGGTGCCAGG AGTTTAAGTT TAAGCTCTTC AGAATCTAGT TCCGTGGCTT

228451 CGTTTGGATC TTTCCAAAAG GCCATAGAGC CTATGAGTGA GGAGAAGGTA 

228501 AATGCTTGGA CAGTGGCTCG TTTAGGAGGG GAGATGGTCA GCTCTCTTCT 

228551 CGATCCCAAT GTTGAGACCT CATCATTAGT GCGCAGGGCA ATGGCAACAG 

228601 GCAACGAAGG CATGATAGAT CTTTCTGATT TAGGACAGGA AGAGGTCAGT 

228651 ACAGCCATGA CATCTCCCAG AGCAGTAGAA GGAAAAGTAA AGGTATCTTC 

228701 TTCTGATTCT CCAGAAGCGA ATCCAACAGG AATTCCAAAT TCTAATACTT

228751 TAGAAAGGGC GGAAAAGGAA GCAGAGAAAC AAGAAAGTCG AGAGCAGTTG

228801 AGTGAGGATC AGATGATGCT TGCACGTGCT ATGGCTGGGC TTCTTACAGG 

228851 GGCAGCGCCT CAAGAGGTAT TGAGTAATTC TGTTTGGTCT GGTCCTTCTA 

228901 CAGTATTTCC TCCTCCCAAG TTTTCAGGAA CTTTACCCAC CCAGAGATCG 

228951 GGAGATAAAT CAAAGCATAA ATCTCCAGGA ATAGAGAAGA GTACGAACCA 

229001 TACGAACTTT TCTCCTCTTC GGGAAGGTAC TGTGAAGAGT GCTGAGGTTA 

229051 AAAGTTTGCC TCATCCAGAA AGTATGTATC GTTTTCCTAA AGATAGCATC 

229101 GTTTCCAGGG AGGAACCTGA AGCCGTTGTT AAAGAATCTA CGGCATTCAA 

229151 AAATCCAGAG AATAGCAGTC AAAACTTTCT CCCTATTGCT GTGGAGAGTG 

229201 TTTTCCCTAA GGAAAGTGGT ACGGGAGGGG CTTTAGGAAG TGATGCTGTG 

229251 AGTTCCTCAT ATCATTTCCT TGCGCAACGT GGAGTGTCTT TACTCGCTCC 

229301 TCTACCTCGT GCTACTGATG ACTATAAAGA GAAGCTCGAA GCTCATAAAG 

229351 GTCCTGGAGG TCCTCCAGAT CCTTTGATTT ATCAGTATCG AAATGTTGCT 

229401 GTTGAGCCGC CAATTGTTCT CCGTTCTCCC CAGCCGTTTT CAGGATCTTC
229451 ACGTCTATCG GTTCAAGGAA AGCCTGAAGC TGCTTCAGTT CATGACGATG

229501 GTGGGGGGGG AAATAGTGGT GGTTTTAGCG GAGATCAAAG AAGAGGATCT 

229551 TCGGGCCAGA AAGCTTCCCG TCAGGAAAAG AAGGGAAAAA AATTATCTAC 

229601 GGATATTTAG GGTTTTAAGT CGGTTTGATG TGATGCGTAT AATTCGTTTT 

229651 GATCCTTATG GTGCGCTATC TGCACAAAGC ATAGCTAAAG ATTCCCGTCA 

229701 AAACTCTCCT TTAGTAGAAA AAATTTCTGA GGAAATTGCT ACGAATGAAG

229751 CGATTCGGCT TGCCTTGCTA GCTATTGGAG ATCGCGAACA AGAGGAGAAG

229801 AAACAGAGGC ATCGTTATAA GCTACTCGGA CAAAAGCAAG CCAAGGTCTT

229851 GCTTTCTCAG TTGCGTCATG TGCATTTAGA TTTTAAAAAA CTATATTGCG 

229901 ATAGTAAGAA AAAAGAAGAT CAGGAAAAAG ACGAAAAAAA CAAACAGAAG 

229951 CGATCTATTA AAGTTACAAA GAAAAAAAAG GGCATCTCTT TAGGGGCTGC 

230001 CGCTTCTCAG GCAATTGCAG CAGCAGCAGA AGCTTGGGTA ATTGCTAGAA

230051 ATAAAGGAGT CTTAGAAACT GCCTCCACTC TTTTTTATCA AAAGGATGAA

230101 GAGGCCTAGA CATACTTGAA TCACGAGCTA GGCCTTCTTC TGTGACTATT

230151 TTAGATCATC ACGCTGCTTC TTGCTCTTCT ACTGGAAGAC TTTTTTCTAT 

230201 AATTTCTTTT TCTTCATCGT CTACTGTAGG ATGTTCTGAA TGCTTAGCTA 

230251 GATCTTTCCA AATCATTCGA AGTTTCTGAT TTTGTTGTTT GGTCAGGTTT

230301 TCCAGTATGA TCAAGCTTTC ATCATTTAAG GAGAATCTTC CTTTAGACCA

230351 ATTTATAGAT CCTGCAAGTA GAGTTTTATT ATCTATAACT GCAAACTTAT 

230401 GGTGAAGAGT ACAGGGTGCG GTATTTATAG AAACAAAGTC TTTATTGATA 

230451 TTTAATTGTC GTAATTGCTT AAAAGTAAGT TTGCTATGAC TTCTATCAAT

230501 GATAATATCT ACATGGATTC CTCGTTGTTT TGCTTGATGT AAGGCTTGAA

230551 TAATCTCCGA GTGGGTCAGA GCAAACATAG CAACTTGGAT GGTTTTCTGA 

230601 GCTGTCTGGA TTTTTTCGAG TACAGCTTGT ATTGCAATTT TACGATCTTG

230651 AGGAAGAACA AAATACTTTC CTGTTTGATC CTTTATAGAA AAGTCTCCAG 

230701 AGGTATTTGT GATAATGAGA TCACAGAGCT CCGAGCTATG CATTCCTAGA 

230751 ATGAGATTAT TATCTAAACG TAGAGAAAGA TTGGTGTAGT TCGCAGATCC
230801 TAGCCAAGCA TCTTTCTTAT CTATGGAAAG AGCTTTTTGA TGCATCAGTT

230851 TACGCCCTGC TGGAGGTTGC TCGACTAAAG TTACATTGCT GGCTTGCTTT 

230901 AAGATTTGGG GAATTTTAAA TTTTTGATAG TAGATCGTAA CTTTGTTTTT 

230951 TGCTTGAGCT TGTCGAGTTA AACTCTGTTG GATCTTGGGT TCTGAGAGGT 

231001 TATAAATACG TAGGAAGATC TCTTCATCAG CGTGTTCTAT AGCATCGCAT 

231051 AGAATTTTAC GCATGTCCTC ATTGCATTGA TTTGAGTAGA TGATAGCTTC 

231101 TTCAGACTTT AAAAAAGTCT TAAAAGTGTC ACCACGAGGA GCTCTTGCAA 

231151 AAATTCCTAC TAAAATCAAC GTGCTAATAA TAACACAGAT TTTTAATTTA 

231201 TCTTTTTGTC TTTTATTCAT TTTTTACTCT TAAAATAGCT TTTTTGTTTT 

231251 TATATATTTT TAAAAATAAA AATAACTAAT TTAATTAAAC GTTTAATAGA 

231301 ATAAGATTCT TATTAAATAT AAAAAGTTTC TTTATTGGGG AAATATGCGT

231351 GCCATCGTTC GGAAACCTTT TGTTTTGTAG ATGGGTCGAC CTCTACTTCT

231401 TTAGGATAGG AAGGCTTCAT CAGGGCATCG ATAACGAAGG GAAAGTTGTA
231451 ATTAGGACGG TGAGTAGCAA AATGGCTGTG GAGCGCGTGA AGATCATTTG
231501 CTGGGGCACA TCGTGTGAAG GTCCTCCAGA GAAAATCTTT TTCACTTTGA 
231551 ATGGTTTCTC TCAGATTATC GGCAAGGATA ATCAGAGGCC ATGATTTTAG 
231601 ATCTGGATGG TGAAGGAGAG ATTTAATACA TCGGTCCTCG AGGGATGTTT 
231651 CCAACACTAG GCAACCACGA CAAAAGGGAG CGATGTCTTG AACTCCATGG 
231701 ATTTTTCCTC CCTGATATCC ATGGGGAAGG TCTCGGATGG CTTTTCCTAT
231751 TCCCATGAAG ATTCCCTTGG AGCCCTTATT TAAGCTTGGT CCTGTATAGT 
231801 CTAACGTATC GTTTGCAGTT TCTGAGAAAA TAATAAGATC TCGGTCTGGC 
231851 TGTAGACGCT CTAAAATGGT TTCTAGAACC ACGGAGAACC TGTCGAGAGG 
231901 CACCTCTTGG TCTGTGACCA TTAGGAATTT CGTTAGGGAA AGTTGGCCCT 
231951 CTCCAAGAAT TCTAAGAGCT GTGGTTAGAG ATTCTCTCCA ATAGCGTTCT 
232001 TTAACGACAG CGGCAGTCAG TGCATGAAAC CCTGATTCTC CGTAACTTTT 
232051 AAGTCTACGC ACACCAGGCA TAACTAACGG AAATAAAGGG GAGAGGTATT 
232101 CTTGGAGTTT GTTCCCTATA TAAAAATCTT CTTGGTAGGG TTTGCCGACT
232151 ACTGTAGCAG GATAGATTGC ATCTTTTCTG TGATAGATTT TATGACAGTG

232201 GAATTCAGGG AAGTCATGTT GGAGACTGTA GTATCCGAAA TGATCGCCAA 

232251 AAGGACCTTC AGGACGACGT TTCCCGGCCG GAGATTCTCC GACCAGGATG 

232301 AATTCCGCAT CGTAGAGTAG AGGGTGGGGA TGGTCGTTTG TTTTTTTATA
232351 AAGGAGCTTC GCTCCTTGGA GGAAGGTAGC AAAGAGAAGT TCCGAGACAT

232401 TCTCAGGTAG GGGGGCAATC GCAGAAAGGG TTAAAAAGGG GTTTCCAGAC
232451 AAAAATACCG AAACAGGAAG GTTTTGCTTT TTTTGCTCTG CTTCATACAG
232501 ATGCATCCCT CCGCCCTTCT GGATTTGAAA ATGGAGGCCC ATGGTGTTTT
232551 GATTGAACCG TTGCACGCGA TACATCCCAA GATTGGGTGT AGTAAGAGTC
232601 GGCGATTCCG TATAGACAAG AGGAAGTGTG AGAAAGGCTC CACCATCTTC

232651 AGGCCAGCTT GTGAGTAAGG GAAGGTGATC TAAGTTAACT GAGGACATAG
232701 AAACAAAAGG AAAGCGACGG AATCGAGCTT TTTTGAGCCC TAAAGAGCTT
232751 ATTCTTTTTA ATAGATCCCG AGATTTCCAT AGAGAAGAAA GCTTTGGTGT 
232801 AGAAGAAATA AGGTGGGCAA CTCGAGCGAT GAGGTTATCA GGAGCTTGAG 
232851 AAAAAAGTTG GTCTACACGA TGTTTTGTTC CAAAGAGATT GGTCAGGACT 
232901 GGGAATGACG ATCCGATGAC ATTATGAAAA AGAAGGGCAG GGCCTTGATC 
232951 TTCAATAACA CGACGATGAA TCTCAGCTAA CTCGAGGTTA GGACTTACGG

233001 GAGCAAAAAC ATCAATAAGT TGTTTTTGTG AACGAAAAAG AGAAATATGA
233051 CGCCTTAAGA AAGACATAAA TATTTCCCTA TACTTAAGTT AAATTAAAAA
233101 TTTTTACTTT TAGCTCTTTC GAGAACTTTC TCTAATCCGA GCTTATCAAT
233151 GTGACGAAGA GCGCTAGCAG AAATTTTAAG CTTAAGAAAA CGGTTTTCTT
233201 CTGTAGACCA TAGACGCTTG GTCAACATAT TAGGGAAAAA TCTTCTTTTA
233251 GTCTTCCCTG TCACTTTCAA ACCAATTCCT TTTTTCTTTT TAGCAATACC
233301 TCGAAGTGTA TAGCTATAAC CACGGCGAGG TCTCTTTCCT GTAAGTGGGC

233351 ACTTTCTTGA CATATTCTTC CTATGAATTC TCTAACATCC GCTCAATAAA

233401 GCAACTCTAT GAAAAGGAAT CTATACTAGA GCTTTTCGAA TATATGGGGA
233451 AGAGGTTTTC TTAGAAAATG TGATTGTCTG TTGTTGATAA TGAAGGAATT

233501 CTTTTAAGAA AGACGGTCGT TCCAGGAAAT TATTATAAGG TTCTTACTAT 

233551 AAAAAGACTT AAATGTTTTA TTGCTATCCT TACAGTCCTG TAAGGATCTT

233601 CTCAATGTAA CCATTAAATT TTTTATGAAT AGCGAGTTCT TCTAAGGAAG

233651 GCCGAACTCG ATACGACCAA TTCTTTTTAG AAATTGTCCC AGGTGTATTA

233701 ATGCGTTCTC TTTGTAGATT TTTTGATACT AAATCAGGGC AGAGGGCGAG

233751 ATAATCGTTA AAGAGGTTGA TATGAAAGAT AGATGCTGAT TCATGAGAAA

233801 GTTTTAAGAT GTCTATTTGA GTTTCTGTAG TCAGGGTTTT TTGAAAAGGA

233851 AGATGTAGAA ATTTAGCAAA TTGCTTAGCT TCCTTAGGTG AATTGAGCCA
233901 CCATTGGGCA AACGTATCAG AGTCGTGGGT AGAGAGAGTG GTCACAGAAA
233951 GTGGATTATA ATCTTTTAGG GGAATGAAGG CACTGTCGCT TTCCCAGTTG
234001 CGTTCCCATC GTGGAATCCG GGTTCCACAG ATTCCTAAGT GTGTTAATGT 

234051 CGTTTTGACG TCTTGGGGTA TAATCCCTAA ATCTTCTCCG ATAGGTAACA
234101 TAGAAGAGGC TCCGAGCATA GTAGAAAGGA TCTCCGTGCC CTGCTTTATA 
234151 TAGTCTTTAG GATTGTCTGG AATGAACCTT CCTCTTCCTG AAGAATCCCA 
234201 AATCCACAAA CGGAAAAATC CTATAATATG ATCTAAGCGA TAGACGGAAT 
234251 AGAAGTTTTG AGCATATCGC AGACGCTCTT TCCACCAAAT GTAGTCGTCT 

234301 TTGGCAAGTT GTGAAAAATT ATAAATAGGC AGATGCCAGT TTTGTCCTTC
234351 AGAATTGTAG AGGTCAGGAG GAGCTCCTAC AGACCTTGAT GAAGAAAAGT

234401 AGTCTCGGAA ATACCAAACA TCACAGCTAT CCTTGCTAAT AAGAATAGGG
234451 AGGTCTCCTT TAAGCAGGAC GTGGTGTTGA TCTGCATAGG CTTTCACTTC
234501 GCAGAGCTGT TGGTAACAGA GAAACTGTAG ATAGGAAAAA AAGAGGACTT
234551 CATCATGGAA TTTTTTAGTT AAGTCCGGAA AATTCTCCTG ATCTGTGAGC
234601 GACTTCGGCC AGTTATTAAT AGGTTCTCCG TGCATATGAT GTTTGATTGC
234651 ACGAAAGGTC CCATAGGGAT AAAGCCAATA GCGCTCGCTT TCTAGAAACT
234701 CAGAAAAATT TGAGTTTCCT TCGAGGGAAG ACTTGCAACA TTTTTGGTAG
234751 TACTCTCTTA AGAATGCCCA TTTTTTTTCT TTAACTTGAG TATAGCTGAC 
234801 TGATGGAGTC GAGCATAACT CATGCATATC TTGAAGTTTC TTGGCAAGTT
234851 CAGGGATGGT ATCGATATTT GGAAGAGAGG ATAGGGAAAG GAATAGGGGA

234901 TTCAGGGCTA CGGAAGAGAT GCTGTTATAG GGACTCGTAT CTTCACCAGT

234951 ATCATTTAAA GGGAGAAGCT GAATAACGCT GAAGCCCTGT TTTTGGCACC 

235001 AAGAGATCAG AGGAATGAGA TCTAAAAATT CACCGATTCC ACAGCTATTT 

235051 TTTGTGTGTA TTGAAAATAG TGGGAGATAA ATCCCGTGTT TAGGAGAGGT 

235101 TCCTATAAGT TTCCAAGCAT GTGCTGAGGG TGAGTGTTTT GTGTATTTTA

235151 AAACATTCAC GCGACGTAGT AGATTCCCAA AACATGAAGG TCAGGAAGAT

235201 TCCCTGTTCT TAGGGATTCC ATCCAAATCT TAGCATGCAA AGAAAAGAGC 

235251 TTGAGATACT CGAATAACTT TTCTCCATTT AGGTAGTTCA TACTCAGGAT

235301 ATCTGAAAGA TAGAGCTGTT GAGTGACCTC ACCGTAGCCT AGAATTCCCT

235351 TGATGCTGGA TTGGAACGAG CCATTTACAG AGAGAGCAGC TTTGAAAATA

235401 CGCTCGCGAA ATACGTTTTC AGGAAGAGTA CCTAGTAGTG TCGATACTGC 

235451 AAGATCTCCG GAATTTCCAT CTTCTTCTAT TTGCACAGGG ACATGGGTAT

235501 CGCTGAAACG GATTAAACAA GAGTTATTTT TATCTGGAGC AAGTGTCGTA

235551 TTTAATAGGG GTGCTAAGGA TTCTAGTAAT TGCTCGTATT GGTTTTGCAT

235601 GGCAATCCTT TTTTAATCAT GACCAAGGAT AGGGTTTAGG GAAGTCTGAT

235651 GCTTTGGGAT AATCTTCATT GTTTATATTT ACAGCATCTA AAGCATTAGC

235701 AATCATAGCT CCTAATTGCT GACGTTTGTC TGCTGAAGAG AAAAGGCGTG

235751 ACGACGTTTG ACGTAAAGCA GAAAAGAATA AGTTCAAGAC ACCGGTCACA

235801 GAATCAACAT CGTCTCCTAT GAGATTGCGG ACTTCTCGTT CTACTTTAGA

235851 TGCTGTTGGG AACTTATCGT TAATGATTTT ATGGTAGGAC TCAGCTACCT 

235901 TCACAAAGTT TAGATCAGAA GGAGTTTGGA TTCCCTCAGC TTTTAAGCTA

235951 TCGAGTAAAA TAGGAACGCG ACTTTCAAAG TAATCGTACG AGGTAAGAAC 

236001 TGCTTGCAGG TTACGAGTTT CTGTCATGAG AACTTGTAGT TGCGCACTGG

236051 GTACGTAGGG ACCCTGCCTT TTTAATTCTG TTQCCATTCC TTTCATTAGA 

236101 AAGGAGCTGA CAATAGCCAT ATCTTGGTAG GTATAGCGGT CTTGAAGCAT 

236151 AGAAAGTAGC TGATCACAGG TATGTGTGTC TCCAGTCACT TCTAAGTACA
236201 AAGAGCGAAG CCCTGAAGGA GAAACATTCA GTTGGTCTGC ATATTCTTGA

236251 GAGGCAAATA AGATGTTTTT CGCACCAATA GCAGTTCGTC CGAATTGCTC 

236301 CGTATGAGTA TTCCTTGCTT GGATAAGCGC TTCTTTTAAT TTACCTTGGG

236351 AGGGTGGAGT CGTTTGAACC AGGTAGTCCA AAGCTGTGGA TTGCAGAGCT 

236401 GGGTCTTTAA TTTTCTCTTG TACAAGAGCA AGAATGTCTT CTGGAGAAGC

236451 ATCGTCTCCT ATTGCATCAC GCAGGCCGCG AAGTTCTTGA CCAGAGATTT 

236501 CAGAATTCCC AGAAGCATAC TTATCAGCAA GATCTGTGTC AGGCTTCTCT 

236551 TCTGTAGATT CAGATTTTTT CTCAGCCTTT CCAGCTTCTC CTTTTTTCCG 

236601 AGATTCTAGA GTTTGAAACT TCTCTTCCTT TTTTTTCGTG CGTGTTGCTG

236651 CTGCGGGATT TGTCAGGTCC TGAGATTGTT GAATCATGTT CATCTCAGAA

236701 CCTTCTTGGC TGGCTACAAC TTCTGCTGCA TCTGCTTTTG CAGCTGCAGC

236751 TTCTACAGCT GCAAGGTTGA CACCCTGAGT GCCTCCTAAA CCACCTGTGC

236801 CTCCTGATGC TGCCATATGC CTCCTATGAG CGACAACGTA TCAATTAGAA

236851 AATCTGAATT CTTCCTAAAG GCTGGATGCG GATTTCTGGT AGGATTTCTT

236901 GATAAGAAAT CACAGCAATG TCAGGGAATT CTGTTTCTAT TAATTTTCGT 

236951 ACATATCTTC TTACATCAAT TGCTGTCAAT AATACTGGTG GTTGGCCTCC 

237001 TGCAGGTGTT GGCGTGATCG TATTCCTCAT AGATTTTAAA ATTAGGTTCA

237051 CAGAATCAGG ATCTAGAGCA AGGTAAGAAC CTGCCGATGT CTGTTTAATT 

237101 GCTCCACGAA TCATCTCTTC AATTTCTGGA TCTAAGAGAT AAACAGAAAT 

237151 TGCTGATTGT CCTTGAGAGA ACTTGAAGCT GATATAAAGC TTTAAAGAAG 

237201 ACCGTACATA TTCTGTAAGC AAAACTGTAT CTTTCTCAGT TTGCGCCCAC 

237251 TCGCTCAGAG ATTCTAAGAT TGTACGTAGG TCTTTAATTG AGATTTGCTC 

237301 TTGAACCAAT CTCTTAAAGA TTTCCGTAAG CTTTTGCAAT GGAATAAGCC 

237351 TTGTGACTTC CTTCACTAAG TCCGGGAATG AACGTTCCAT AAATTCGATC 

237401 ATAGAACGTA CCTCTTGAAT TCCCAAAAAC TCTTGAGAGC TTTTATGGAA

237451 AAAGTACGAA AGATGGAGAA TGATCACTTC GAGCGGCGTC CAATATTTAA 

237501 TTGCTGCCTT CTCTAGAATA GCTTTTGCAT CTTCACTAAG CCAAGCTGAA
237551 GGAAGACCCG CAGCATTCTT ATAGGTAATG AAAGGTAGAT TATAACGGCT

237601 GAGATTGTCC TCCACCTCAT TGGTTAACAC ATGGTGCGGA GGAATTTTTC

237651 CTCGCACATA AGGGACTTCA TTAAGCAGAA TCATATAATC GTATCCTTCT 

237701 AAAGAAGGGG AATCTGTGCG AACATGAATG CCAGGGTATC GGATTCCGAT 

237751 ATCCTGATAG AGAGCTTGCC GCATTTTAGG AATCATATCA TCAACAAAGC 

237801 TTTGTCCTGA TTTTGTCTTG TGTTGGATAA GCTTAGAGAG ATCTTTTCCA 

237851 AGTTCTAGAA TTACGGGAAG AGTTAGAGAA TAGTCATCGG GATTATCCCC 

237901 AACAGTAGCA GCGCCATCAC CAGCAGCCCC TACGGTTGTT GAAGCTCCTG 

237951 AGCCACCACC TTTTTTTCCT GCCGCTGATT TCTTAGTCAG TAGGAGAATC

238001 CCTAAGGCAA CGAAAATTAA TGCTAAAATG GAGAAGGACC ATAGAGGGAA 

238051 GCCCTTGAAG AAACCAACCC CTAAAGTTGC AGCACCTGCA AGGAGTAGTG 

238101 CTCGTGGTTC TTTAACGAGC TGAGTAGAAA TCTCTTTACC CAAGTTCGTA 

238151 TTTTTGTCAC TCGATACACG AGTCGTGACA ATACCCGCTG TCAACGCAAT 

238201 CAAAAGAGAA GGAATTTGAG AGACTAAACC ATCTCCAATG GAGAGAAGAG 

238251 TGTAGACGTG AGCTGCTTGA GCGAGGTCCA TGCCGTGCAT AGCCACCCCA 

238301 ATCGTCAAAC CGCCAACAAT GTTAATCAAA GAGATAACGA TACCAGCGAT 

238351 AACGTCTCCT TTGATGAACT TCATGGCACC GTCCATGGCT CCGTAGAGTT 

238401 CACTTTCCTT TTGGATTTGA GCCCTTTTAT CACGAGCTTG TGTGGCATCA 

238451 ATCATACCAG CTCGTAAGTC CGCATCAATC GCCATCTGTT TACCTGGCAT 

238501 CGCATCCAAT CGGAATCGGG CAGCAACTTC GGCAACACGC TCGGCACCCT

238551 TAGTTACTAC GATAAACTGA ATGATTGTAA TAATGAGGAA GATAATGAAC 

238601 CCGACCACAT AGTTCCCTCC AACCACGAAG TCTCCGAAGG CCTGAATGAC 

238651 ATGACCCGCA TACGCTTTAA GGAGAATCTG TCGAGAAGAG GAAATATTAA 

238701 TCCCCAAGCG GAACATCGTA GTGATGAGGA GCAACGAGGG AAAAACAGAC 

238751 AGCTGCAAAG CACTTGGAAT ATAAAGAGCC ACCATCAATA AGAATACAGA 

238801 GATCGATAAG TTGATGGTGA TCATCAAGTC AACGATAGGC GGAGGCAAAG 

238851 GAATAATGAT CATTAAGACA ACGCCCATCA TCCAAAGAGC AAGGATTAAG

238901 TCGCTGGACT TATTGATCAT GTTTAAGGCG GTATCGCCAC CAAGTGTTCT

238951 GCTGACGAAA TTGAGTAGCT TATTCATTAT AAATGATCAG GTTGGTTAGT 

239001 ATTTTTATTA TTAGGATTTT GCGCATTCAG TGAAGTGATA TAGAGTAGAA 

239051 TTTCTCCAAT AGCTTCGTAA GTAGATTCTG GAATAAATTT TAATTCCTTC 

239101 CCTTCATCCA AAAGCTGATG TGCTAAAGGT ACGTTTCGCA TAATGGGAAT 

239151 TCCGTACTTT TCAGCTTCAT CAAGTATCCT TTTAGCTCGT AAGTTGATGC 

239201 CCATGGCAAT GATCCAAGGT GCTTTATATT TTTCAGGCAT GTAGCCAATA 

239251 GCAACAGCAA TATCTTTGGG ATTAGAGACT ACGGTGCTTG CATGTTTCAC 

239301 CTGTGATGAC GAGTCTTCAT AGGCAATTTC TTGAGCAATT TGTCGACGAC 

239351 GGCCTTTAAT CTCAGGATTT CCTTCCGTGT CTTTAAACTC CTGCTTAACC

239401 TCAAACTTCT CCATCTTTAA TTCTTTAGCG AAATTGTGGC GCTGATAGAC 

239451 AAGGTCAAGA ATCGCAACAA TCAAAAAGAA AATTCCTATC GAGGTTACTG 

239501 CTTTATAAAA AATTTCTTTG AAGATTTGAG CAGTAATTAT AGGAGAGACT 

239551 CCTGCAGTTT CTATAATTAA AGAGACTTTG CTTTTTAACG TTATGTATAA 

239601 AATTAAGGCT GCTCCAAAAA TTTTTAAAAT CGATTTGATC AGCTCTATGA 

239651 GAGTCTTTAT TTTAAACTTT TGTTTGATGT TCTCAATAGG GTTGAACTTC 

239701 TTGATATCTG GTTTAAAAAC TTCGGTAGAA AATGTAGGAC CAACGATAAG 

239751 AAAACCTACA ATGACGCCAA CAACAGCAAC AGCTCCCAGT AAGGGAAGTG 

239801 ATGCTGTTAA AATAAGCATA AGACAGTTCT TAAGATAAAA TAAGGTAATT 

239851 ACAGGATCAT GGCGAGTGGG AGCTTGTGAG AGCATGGAAA CCAGAAAGCC 

239901 ACCTAAATGC TTGAAAAAAA AGGTCGATAG GGAGAAAGCC GTAAACATAG 

239951 AGACGATAAA GGTAACCGCA GAAGGAAAAT CCTGAGATTT TGCTACTTGA

240001 CCTTTTTTCC GAGCATCTCT AAGTCGCTTC GGCGTGGCCT TTTCTGTTTT

240051 TTCACCCATG CTGTTGCCAA GGTTCGATTA AGAGGACTAT ATTCGGAATT 

240101 ATCTGCAAAT TACCTTATGC TCTTGTTTTT CTCAAGAGTG AAGCCCAAAC 

240151 GTTTTGTTCT TAGTAAGATT ATAAATGAAT CTTAGAGATA AAGAAGAATA 

240201 GTAGAAAAAA AGAACAATAA CGAAAAGTTG TTGAATTTAG CAATACTTTA 
240251 GGTCACTTCA AGATGGTTCT ATATGAAGAA AGTATTAGGT AGGGAAAACT
240301 TCAATCGATT ATAGTGGTTT TTTATCATTT TCAAAGCACT CTTGATTTTG
240351 AAGGTACTAT TTTTCTAGCT ATTGTGCTTT TTTGAAACAG AAGAGTTGTC
240401 TTTCTAATAT TCGAAAAAGC ATGTCATTAT TTTTATTTTT AGATGTCTTA
240451 TGAGTCATAC TGAATGTGGA ATTGTAGGGC TTCCTAATGT AGGAAAGTCT
240501 GGCTTATTCA ATGCTTTAAC AGGAGCTCAA GTTGCCTCCT GTAACTATCC
240551 GTTTTGTACT ATCGATCCTA ATGTGGGTAT TGTTCCTGTT ATCGATGAAA
240601 GACTGGAAGC CTTAGCTAAA ATTAGCAATA GTCAGAAGAT CATCTATGCG
240651 GATATGAAAT TTGTAGATAT TGCAGGTTTA GTTAAGGGAG CTTCCGATGG
240701 CGCGGGTCTG GGAAATCGGT TTCTCTCTCA TATTCGAGAA ACTCATGCTA
240751 TTGCTCATGT AGTGCGTTGT TTTGATGATC CAGACGTTAC ACACGTTTCA
240801 GGAAAAGTCA ACCCTGTTGA GGATATTGAA GTTATCAACT TAGAGCTCAT
240851 TTTTTCTGAC TTCTCCTCAG CAAAAAATAT CCATAGCAAA TTAGAAAAGC
240901 TAGCCAAAGG AAAGCGTGAA GTAGGAGCTC TCTTGCCTCT ATTTGATACA
240951 ATTATTGCTC ACTTAGAAAA GGGGCTGCCG CTACGTACTT TAGAATTAAC
241001 TCCAGAACAA ATTGTGGCAT TAAAGCCCTA TGCGTTTTTG ACCATGAAGC
241051 CTATGTTTTA TATAGCTAAT GTTGACGAGA GTTCTCTACC AGATATGGAT
241101 AATGATTATG TTGCCGCTGT TCGGGAAGTT GCTGCAAAAG AAAATTCTAA
241151 AGTGGTTCCT ATCTGTGTTC GTATAGAAGA AGAAATCGTT TCCTTACCTA
241201 TTGAAGAGCG CTTAGAATTT CTTATGAGCT TAGGTCTTGA AAAATCAGGA
241251 CTTCATAGAT TAGTGCGTGC TGCGTATGAC ACTTTAGGAC TGATTTCTTA
241301 TTTTACTACA GGTCCTCAAG AATCTCGTGC ATGGACAGTG GTTCGAGGGT
241351 CTTCTGCTTG GGAAGCTGCT GGAGAAATCC ATACGGATAT TCAAAAGGGC
241401 TTTATTCGTG CTGAAGTGAT TACTTTTGAA GATATGATAG AGTGTCAAGG
241451 TCGTGCAGCT GCTCGAGAAT TAGGGAAATT ACATATAGAA GGACGTGATT
241501 ATATCGTCCA GGACGGTGAT ACTATGCTGT TCCTTCATAA TTAAAGGAAC
241551 CCTTTGCAAA CCAATCTTGA GCATCCAAAA TGTCTTTTTC AATTGCTCGT
241601 ATTAGAGTTT CTTTTGATTG AAACTTTTTT TCTTCTCTAA GAAATTTTCT
241651 CGGGATAATG CTCACTTCTT TGCCGTATAG ATTTTCCGCA AAGGAAAAGA
241701 TATGCGCCTC TGCATATAAA GACTCTCTTC CAAAAGTAGG GGCAGTTCCT
241751 AAATTCATAA CACCCTGACA GGTAGTGCTA TCATAACGTA TTTCACAAGC
241801 ATAAACTCCT AGGGGAATTA AACTTTCTTC TCTAGGAAGA TTTATAGTGG
241851 CGAATCCTAG AGAACCTCCT ATTCCGGAGC CCTCGGTTAT TTTTCCAGAA
241901 ATGGCATAGG GATGACCCAA AAAACGATGA GCACATTCAA GATTCCCTGC
241951 GGACAGAAAC TGGCGGATTG CTTTGCTGGA GACAACTATG TTATCCATAC
242001 GGTAAGGAGG AATCTTGATG ACCTCTATAC CTAACGGCTT GCCTATAGTA
242051 TCGAGAGCCT CGGTATTGCT TTGCTGTTCT TTCCCTATGC AAGAATCATA
242101 ACCTAAGATG AGGCGTTTGC ATTTCAAGTT ACGATGTAAC AAAGTAAGAA
242151 ATTCTTCTGC CGATTGATTC GCAAAATTTA AATCAAAAGT AAGGACACCT
242201 AACCAGTCTA TGGGAAACGT TTGCAATAAT TGGAGGCGCT CTTCTTTTGT
242251 ATTGATGAGT TTCGTGTGAT TTAAAGAAAG TACCGTTTGA GGATGAGAAT
242301 CAAAGGTAAT AACTCCACTG GATCCAGAAT AGGAAGTAAG AATAGATAAA
242351 AGATTGCTAT GCCCTAGATG ACATCCGTCG AAAAAACCTA CAGTTACAGA
242401 ATCTACAGAA AACGAAGACG TTAAACTATA GGCTATTTCC ATGGGCATCT
242451 CGTAGGTAGG GAGAAATATC GAAATCGGGG TGGTCTAATA GATTCCCATC
242501 AATACATTCA TCTATAGAAA AACGGCCACT GCGTAAACGG CGTAGCTGCT
242551 CAAGATAAGC TCGAGAGCCT AACATCGTGC CAAGCTCATG AGCAATGCTG
242601 CGAATATAAG TTCCTTTGCT ACAAGAGACT ACAAAATGCA ATAAAGGGTA
242651 CTCATATTTC GTAATCTGCA AGTGAACTTG AACTGTAGAJ^ TGGTGACGTT
242701 CTATAGATAA ACCTTTTCTA GCATATTCAT ACAGCTTTTT CCCTTGGACT
242751 TTTTTAGCGG AAAACATGGG AGGCAGTTGC TGGATCTCTC CTTGGAAATA
242801 CTCGGCAGCT GATAATACTT CTTCGAGACT AGGAATCTTC TTAGATCTTC
242851 CTACAACTTT GCCGTCGCAA TCATAAGAAT CGGTAGTTGT CCCTAAATGG
242901 GCAATTGCTT CGTATTCCTT GTCTTCAAAA AGTAAAATAT CAGAAAGTCT
242951


AGTAAATTTA 
243001


CTAAACj 1 i L-C 
243301
TTAAAGCTTC


TAAAGCCTCT 


243351 


TCCTTGGTAT 
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P ATTAP AP AT 


APATAAACAC 


GTGCAGAGTG 


243401 


CAAATCCTTA 
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PPTPATPPAA 


AGATTAGAAA 


243451 


TCTTGGGATG 
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243501 


AATAAAGCAT 
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TTTTPTPTP A 
i X X x^xoxv-rt 


TACAGTACTT 


243551 


CAAGTTATAG 
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TP ATAPATAA 
i A X AvjH 1 A/\ 


PTTPATAAPA 


TTGTAGGACA 


243601


TCACCTATTT 


GAGG i 1 GG i G 


PT ATPPTTPT 
243651
243701
243751
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PA APPHTAAA 


TAGATCCTAC 


243801 
243851
TAAGCTTAAG


AGTCTTTTTA 
244201
ACGCTGTTGT


CCTGCGGATC 


TAGCTTCAAT 


244251 


AATGTCTCTA 


GCCGTTTTCT 


CGTTTTTCAC 




GGATCGCCAG 
244301 CTTTAGGAAT GTCCGATAGA CCTGTGATCA ACACAGGAAT AGATGGCCCA

244351 GCTTCTTTCA TCAATTCATT ATGTTCGTTA TGCATAGTTT TCACTTTGCC

244401 ATAACAATCA TTGAAGACGA GAGCTTCGCC CAGTTTTAAG CTTCCATTTT
244451 GAATCAAAAC AGTCGCAACA GGTCCGAGAC CCTTGTGCAG TTCTGATTCA
244501 ATAACAAGTC CTCGAGCACG TGCTGAAGGA TCGGCTTTTA GCTCCAAGAC
244551 TTCAGCTTGT AAAGCTAACA TCTCTAAAAG TTCTGAAAGA CCTTCTCCTG
244601 TTTTTGCGGA GGTATTTACT GTAACAGTCG AGCCTCCCCA AGCTTCTGGC
244651 AATAGATTGA TTTCAGAAAG TTGTCTATAG ATGGTTTCGG AATTAAAATT
244701 AGGCTTATCA CACTTGTTGA TAGCTACAAC AATAGCGATA TCAGCAGCTT
244751 TTGCATGTTC AATAGCCTCT AAAGTTTGTT CTTTAATTCC TTCGTCTCCA
244801 GCGACTACAA GCACAACAAT ATCACAAACT TCAGCTCCAC GGGCTCGCAT
244851 TGCAGAGAAA GCTTCGTGAC CAGGAGTATC TAAAATTGTT ATGTCTCCCA
244901 CTGGGGTGGA GCAGCAGAAG GCTCCCATGT GTTGGGTAAT CGCTCCAGCT
244951 TCTGTTGCAG CGACATTACT TTTCCTTAAG GAGTCAATGA GTGTTGTTTT
245001 TCCGTGGTCG ACGTGACCCA TAAACGCAAC AATAGGGGAG CGAATCACAA
245051 GCTTGCTGGG ATCTGTAGAT TGAATTTCGT CTCTTACAGT GTCATTGCTT
245101 AGGCACAACT TATCTTGCTC AGAATAGTCG ATGTCAATTG TACATCCAAA
245151 CTCTAAGCCA ATAAATTGTA CTGCAGTTTC GCTGTCTAGA ATATCATTGA
245201 CTACATAGGT CATTCCATGA ATGAATAACT TTTGAATGAC TTCTGAAGCC
245251 TTGAGCTTCA TTTCTGCTGC CAGATCTTTG ACGGTAATTG GCAAGGAAAT
245301 TTTGATATGC GTAGGTCGCT GGATAGAGGC TTCGTCATAG TGTTTTTTAG
245351 GCTTATAAAC ACGTTTTTTT CGCCATCTGT CTTCTTCTCC GCCTTCATTT
245401 AATCCGTAAC GATCTCTTCC TGTAAAAGCC TTTAGGCTTT CATCAGATTT
245451 CTTAGAACGA TCACGAAAGT CGGTAAGATT TTTCTTCCCA GCATCTCTTT
245501 TTGGACCAGA AGOAGGACTA CGGTTAGCAG GATTGAATTG TTTTTCTCGA
245551 TTATTCTGTT CACCACCTTC AGATGTTCCA GGTTTCCCTG TTTTATCTGA
245601 AGCAACGGGC TTTGTGCTTT TCGAGCCAGC TACGACTTTC TCTTCCTTGG
245651 CAGGAGCCTT GAATGTTTTT GCTAGGAGAT GATTGATATG CTTTCCTGTA

245701 GGGCCGAACT TAGATTTAAT CATTACGACG CTTTTAGGTT CAGCAGGCTT

245751 CACAGGCTTA GGCTCTAATT CTTTTTCTTG AGGTGGGGTT TCGGGCAATA

245801 CGGGTTGCTC AGGAAGAACT TCAGCAACTG GATGAACCTC AGGACTTTCG

245851 TCACAAACCT CATCGACTAC TTCTAACTCA GGCTCAGGAT CTGCTATGGA

245901 GACTGGAGCA GGTTCAGATG TATCCACTGG AATATGAGCA GAAGACTCTT

245951 CTTCGGATGA TGAGAACGAC GAACGATTTT TAGCACGAAT GCGACGTGAA

246001 GTAGACTCTG GTGAAGCTTG TTCCGCACTT GCCGTAGGGG TAGAAGTTGC

246051 GGCAAGAGCT ACTTTTACAG ACTTTTCTTT CGCAGAAGGT TTTTCTGAAG

246101 AAGATTTAGC TTCAGAAGAT CCTGCTTGGG CAAGTTTTTG TTTTAGCTTA

246151 TCCAGCCCCG CGGCTTTCGT TAATTGAGCA TTCTTAATCT TCAATTTCAG

246201 GTTTTTCGTC AACTTTACTT TCTCCATATT TGCTGACTTG CTCAAGGATC

246251 TTATAAGCAA GCTCTAAACT GATCCCAGGA ACAGATGCCA GATCATTAGC

246301 ACTCGCTAAT AATACTCTTC TAATTGTGTC ATATCCAGCA TGTTCTAAAT

246351 TTTGGATGAC TAGCTTACTA ATCCCTTCCA TTTCTAAGGG TTGATCTAAG

246401 TGCGGACTAT CGAATTCTGC TAATTGAAGG CGTTGAATTT CTAGCAACTT

246451 ATTGTACTCA CTCATACGTT GTACTTCGAG CTCGTAGTCT AGAATGTGGC

246501 TAATTAAACG AGCGTTAATT CCTCGTTTAC CAATAACAGT AGCGTAGTCT

246551 GCATCATTAA CGACAATTGC AATCACTTTG TCGTCTTCTA AAATAGCAAT

246601 CTTTTGGATT TCTATTGGAT AAAGAAGATT CTGTAATAAC TCTGTAGAGA

246651 CGGGGGAGTA ATTGACAATG TCAATTTTCT CATCGTTCAA TTCTCGAATG

246701 ATATTTTTTA CTCGAGAACC TCGCATACCT ACAAAAGCTC CAACAGGATC

246751 AGTTTTAGGG TCTGACGATC TTACAGCTAG TTTCGTGCGG TACCCAGCTT

246801 CACGAGCTAT CTTAACAATC TCCACAGAAC CTTCTTCTAG TTCTGGGACT

246851 TCTTGAATAA ATAATTGTTT AACAAATTCT GCGTGACTAC GACTGAGGAT

246901 AACTTCCGCT CCACCATTTT CAGACTCTTG AACTTCATAG AGTAGGGCGT

246951 AAATTTTATC ACCGATCTTA TGTTTTTCTG TTTTAGGATA AAACCGGGTA

247001 GGAAGAATTG CTTCAACTTT TCCTAAGTCA ATAATTAAAT TAGAACCTTT

247051 AGCAAAACGT TTGACAACAC CAGATAAAGT TTCATTTACG CGATGGCGAT 

247101 ATTCTTCATA AATAACGTCT CTCTCAGCAT GTCTTAGCTT TTGACCGATA 

247151 ATTTGTCGTG CTGCGTGAGC AGCTATCTTC CAAAATTATC AGAAACAAAA 

247201 GGGACATCCA TGTACTGACC AATCTGACAG TCCGGATCGT ATTCTCTGGC 

247251 TTTATCTAAA GGAATTTCTT TGCTAGGATT CTGACAAATT TCTACTATTT 

247301 CCTTTTCACA AAAGACTTCG ATGTCACCAG TACGAGAATT AATGTTTACA

247351 GATATGTTCG CGTCATCTCT TAAGGTTTTT TTAGCAGCAA TTTTTAAAGC

247401 AGATTCGATA GCTCCTATAA TAGTAGAGCG CTGAATCCCT TTTTCTTTCT

247451 CCATGTAGTC AAAAATAGCT ACAAGATTTT TATTCATTAT ACTCCTCTTG

247501 TAAATAAGGA AATCAGTAAA TAGCTAAATT AAGGACGTAT TATTTAGAAC

247551 TAAATAATAC GATCCTCTGC ATTACCAGCA TTAGATGCTA TTTTCCTTTT

247601 TTCTTTCTCT CTTTAGGGCC TTGAGAATCC TTGAAATCTA ATTCAGTCCT

247651 AGAGTCTTGA TCATAAGCAT TGTCAGCTAA GTATTCTTTT ACAGAAAGAG

247701 AAACTTTTTT ATGATCTGGA TCTAGCTTAA TTACTTTTGC AGAAACATTT

247751 TCTCCAATGG AGATAATATC TTCAATTTTT GCAAAGGGCT TGTCAGAAAG

247801 TTCTGAAACG TGAATCAATC CTTCAATCCC GTTTTGTAGC TCAACAAAGG

247851 CTCCAAATGC AGTGATTTTA GTCACAACTC CTGAAATTAC TGTGCCAGCA

247901 GGGAACATAG CTTCAATTTC ATTCCAAGGA TTAGAACTTA ATTGCTTAAC

247951 TCCTAAAGTA ATTTTTTTAC TTTCTTTGTC TACTGATAAA ATAACAGCCT

248001 CTACAGAATT TCCTTTTTTG AATAGTTCTG AAGGGTGAGA GACTTTTTTA

248051 ATCCAACTCA TGTCAGAAAT ATGAATCAGA CCCTCAATTC CTGGTTCTAA

248101 TTCAACGAAA GCACCGTAAT TGGTTAAGTT CTTGATTTCA GCATTGACAT

248151 GGAGACCTAT AGGATATTTT TCTTCGATAT TGTCCCAAGG ATTACGTTCT

248201 GTTTGCTTTA ATCCTAGAGA AATTTTTCCT TCGTCCTTCT GAATAGATAG

248251 AACAATGGCT TCAACTTCAT CGCCTTTATT TACGACTTCA CTAGGATCTA

248301 CAATATTTTT CACCCAAGAC ATTTCAGAAA TGTGAATTAG ACCTTCAATG
248351 CCCTCTTCAA TTTCAATGAA AGCTCCGTAG GGGAGAAGCT TCACAATTTT
248401 ACCAAGAACT CGTTTTCCAG GAGGGTATTT CTTCTCAATA TCTTCCCAAG
248451 GATTATGCTC TTTTTGTTTG AGACCTAGAG CAACTCGTCC TTTTTCTTTA
248501 TCTACGCTTA AAATAATTAC TTCCAACTCT TGATTCAATT CGACCATTTC
248551 GGAAGGATGT CGTATGCGCT TCCAGGTCAT ATCGGTAATG TGGAGAAGAC
248601 CGTCAATACC ATCGAGATCT AAGAATACAC CAAAGTCAGT AATGTTTTTA
248651 ACAACTCCTT TGCGGTATTC TCCGATAGAA ATTTGTTCAA TAAGTTCGGC
248701 TTTCTTAGAG ATTCTCTCAG CTTCTAAGAG TTCTCTTCTT GAGACAACAA
248751 TATTGCGACG TTCAACGTTA ATTTTTAAAA TTTTGAATTC ACAAACTTTT
248801 CCGACATAAT CATCTAAATT TTTGATTTTC TTGTTGTCAA TTTGTGATCC
248851 AGGTAGGAAG GCTTCCATTC CAATATCTAC AATAAGGCCG CCTTTGACTT
248901 TACGTGTAAT TTGACCTTTA ACAATAGAAC CTTCTTCACA ATGAGCTAAG
248951 ATGTATTCCC ATTGACGTTG TCGTGTGGCT TTTTCTCTAG AAAGGACAAC
249001 TTTGCCCTCT TCGTCTTCGG CTTGGTCGAG ATAGACTTCT ACTTCAGCTC
249051 CAAGCACTAA ACCTTCTGAA GAGTCTATGA ACTCTGACAT AGGGATCACT
249101 CCCTCAGACT TCAGACCAAC ATCAACTACG ACAAAGTCTT TATTAATATC
249151 AACTACGGTA CCTTTTAGGA TGGCGCCAGG CTGTATTTCG TTATCAGATT
249201 CTTCTTCGCT CGAAGTAATT CTGTGTGCCG TATAAAGCAA ATCTTTAAAT
249251 TCGGCAACGT CTTCTGTGAG GCATTCTATA TTGTCCAGAA TTTTTTTAGA
249301 TCCCCAAGTA TATTCAGCTT GTTTTGGCAT TTATTGTTAT TCTCCTAAAA
249351 ACTACAAAGT CAAAAAAGAC AGTGTAAATA TAAACCTTAA AAAAGGCAAG
249401 ACCTCCCTTG CAGGAGACGA TTAGTTCTAT ATAATCGAAT TAATCGTAAA
249451 ATCCTATACT 'TCCiGTTCTAG AGATGTATAA GGCCTCTTTT TTAACTTTTC
249501 TATATTACCA GATTTTATAA AATACGTGTG GTAATTATCA TAAATGTGCC
249551 TTGTCATATG TAGAGAAACA GGATTTGGCG CATAGAAAAC ATATCGTTTT
249601 TTTCATAAGC ACCCCATAAA TTCTCGfCCC AGATGGGCTA CATCAGCTTT
249651 GTTCTTAAGC CTGATAGATC TTTTTGGTTG CTGCTCAAGA ACCAAGTGCA
249701 TTCAGAAGTA GAACCATACT TATTGAGGGT TAAATAATAG GATGGGTCTT
249751 GGTTTCTTAG CTCTAAGAAA ATTCTAAAGA TTAATCACAG AAATGTAGGT
249801 GTTATTACTG GAAGATGAAG ATCTTGAGAT TTGAAACTTT AGTCTTAAAT
249851 CCGTATAAAT TAAAACTAGA CAAAAGCTAT TCAAAAGCAA GTTTTTTGCA
249901 TAGAGAAACT GACGCATAGA CGCTAAGTGT TGTTTATTAG ATAGGCTAGA
249951 ATTAAGTGTC TTTTTTCTAG TAGATCAAGA ACAGTGTTCT CAATTTCTAA
250001 AGTTAGATAA GGAAATTATA AATGATTCAT TCCCGGTTAA TTATTATTGG
250051 TTCAGGTCCA TCTGGATATA CAGCGGCAAT TTATGCATCA AGAGCGCTTT
250101 TGCATCCTCT TTTATTTGAG GGGTTTTTCT CTGGGATCTC TGGTGGCCAG
250151 CTTATGACTA CAACAGAAGT TGAGAATTTT CCAGGGTTTC CTGAAGGGAT
250201 TCTTGGGCCA AAACTTATGA ATAATATGAA GGAGCAGGCT GTGCGGTTTG
250251 GGACCAAGAC ACTAGCTCAA GATATTATTT CCGTAGATTT TTCTGTTCGC
250301 CCTTTTATTT TGAAATCAAA AGAAGAAACC TATTCTTGTG ATGCCTGTAT
250351 CATAGCTACA GGAGCTTCTG CTAAACGTTT AGAAATTCCT GGAGCAGGAA
250401 ACGATGAATT TTGGCAAAAA GGAGTGACTG CTTGTGCCGT TTGCGATGGG
250451 GCTTCTCCTA TTTTTAAAAA TAAAGATCTT TATGTGATTG GGGGAGGGGA
250501 TTCTGCTTTA GAAGAAGCTC TTTACCTGAC TCGTTATGGA AGCCACGTAT
250551 ATGTAGTTCA TCGTAGAGAT AAACTGCGGG CTTCTAAAGC TATGGAAGCT
250601 CGGGCGCAAA ACAATGAAAA AATTACATTT TTATGGAATA GCGAGATTGT
250651 AAAAATTTCT GGAGATAGCA TTGTCCGTTC CGTAGATATT AAGAATGTTC
250701 AGACTCAAGA AATTACAACT AGAGAAGCTG CGGGGGTGTT CTTTGCTATA
250751 GGCCATAAGC CAAATACGGA TTTTCTCGGA GGACAGCTGA CGTTAGATGA
250801 GTCGGGCTAT ATTGTGACTG AGAAAGGAAC GTCCAAGACT TCTGTCCCTG
250851 GAGTATTTGC TGCTGGAGAT GTTCAGGATA AGTACTATCG TCAGGCGGTT
250901 ACTTCTGCAG GAAGTGGTTG TATAGCAGCA CTAGATGCTG AAAGATTCTT
250951 AGGCTAATGC GATTGCAGTT GCTGTGGCAT ACTCTTTGCA GTGGCTTATA
251001 GAGAGAATGA CTTTAGAAAT TCCAATTTTT GCATAGACAT GCGAAGGGAG
251051 GAGAACTTCG GGTCCGTGAG ATACTTTAAA GACTTCGATG TCTTTCCAGG

251101 CAACAACGCT CCCTATGCCA GTTCCTAAAG CTTTTGCTAC AGCTTCTTTT 

251151 CCAGCAAAGC GACCTGCAAA TGAAGGGATG GGATCGGTCT TTTCTAAGCA 

251201 ATATTTCTGT TCTGCTTCTG TAAAGATTCT ATTGAGTAGT CGATTGCCGT 

251251 GAGTTGCAAT TGCCTCGCGA ATGCGGCTAA TTTCAATAAT ATCGGTTCCT

251301 ATATGAATGA TTTCCATAGA GTTCGCACTA ATTTCCTTCA GGATTTTCCA

251351 TTAGGGCATT CTGTAATTTT GAGAAGTAAG CGATGGCAGT TTTAATGAGA 

251401 GCATAGCCCA CTGTAGCTCG AATCACGAAT AAGGCTTGAG GATTGCTGAT 

251451 ATAACGACGA AGGACTAAAG AGAAGGCAAT CATAATGACC ATGATGAGAA 

251501 ACCTTCTAGA GATAAAGGTC TGCTTTATAT AGGCAATCCA TGTAATTTTT 

251551 TGTGAAAAGA ATTCAGAAGA TAGACTCAGC TGTTTCCTTA TAGTTTTGCT 

251601 TAGCAAGTAG CGATGTTTTA GGGATGCTAA TCCCCAAGAA ACCATAGATA 

251651 GAAGACAGTA GGTAAAAAAA GAAGAAAGGG GGTCTAGGGC TATTGCGGTA 

251701 GCTTTTAGCA AGAGTTTCAT TCCTGCTGAT ATCCACAAAA TACCAGGAAA

251751 TAGTATCAGG AAATATTTGA TGTTTCTTGC CATAGTGCAT TACACTTAAT

251801 TTCTAAATAC TCTATTTTAC TTGGAGTATG AATTTTTTGC TAACTGGTAT 

251851 TAAGATATGC GAGCTGAGAT GGCTGTGATC TATTGGGATC GCTCAAAAAT 

251901 TGTCTGGTCT TTCGAGCCAT GGTCTCTAAG ACTTACTTGG TATGGCGTCT

251951 TTTTTACTGT AGGGATTTTT CTAGCATGTC TCTCAGCAAG GTATTTGGCT 

252001 CTTTCCTATT ATGGTTTGAA AGATCATTTA AGTTTTTCCA AAAGCCAGCT 

252051 ACGCGTGGCT TTAGAAAACT TTTTTATATA CTCTATTTTA TTTATTGTCC 

252101 CTGGAGCTAG ACTTGCCTAT GTGATTTTTT ATGGATGGAG TTTTTACTTA 

252151 CAACATCCTG AAGAGATCAT TCAAATATGG CACGGAGGCT TGTCGAGTCA 

252201 TGGAGGCGTT CTTGGCTTTC TTTTGTGGGC GGCCATTTTT TCTTGGATAT 

252251 ATAAAAAAAA GATTTCAAAA TTGACTTTTC TCTTCCTTAC AGACTTGTGT 

252301 GGATCAGTTT TCGGAATTGC AGCGTTTTTT ATTCGTTTGG GTAATTTTTG 

252351 GAATCAAGAA ATTGTAGGAA CACCGACTTC TTTGCCTTGG GGGGTGGTTT 
252401 TTTCTGATCC TATGCAAGGT GTCCAGGGAG TTCCTGTGCA TCCTGTGCAG

252451 CTTTATGAGG GAATCAGTTA CTTGGTCGTC TCTGGAATTT TATATTTTCT

252501 TTCCTATAAG CGTTATTTGC ATTTAGGTAA GGGATATGTG ACTTCTATAG 

252551 CCTGTATTTC TGTCGCTTTC ATTCGTTTTT TTGCGGAGTA TGTAAAAAGC 

252601 CATCAAGGGA AAGTTTTAGC AGAGGATTGT CTACTTACAA TTGGTCAAAT

252651 TTTATCTATA CCTTTATTTC TATTTGGTGT GGCCTTACTT ATCATTTGCT 

252701 CATTGAAAGC TCGAAGGCAC CGTTCACACA TATAGCAGCA TCTTTTTGGC 

252751 ATGAGACTTT AGAATTTTAC AGTCTATCAC AATAAGAGAC CTGTATTGAG

252801 AGATTCTCCT TTGGGAGATA ATGTGGGTTT ATGATATTTT GACCTCTTAA 

252851 GTTTAATTTT TAGAGTTTAT CAAATGAATA AACGCACTTT GCTTTTTGTT 

252901 TCTTTAATTG GGATTGCTTT TGTAGGATGT CAAATATTTT TTGGTTATAA

252951 TGAATTTCGT TCCTGCAAAA ATCTAGCAGA GAAACAAAGA AAGATTTCAG 

253001 AACAGACGCT AGCTGCAGTA GAATCTGTAG GGTTAAGTGT AGCTTCATGG

253051 GACACCGATG TAAACGGAGA AGAACATAAG AATAACTACG CAGTTCGTGT 

253101 TGGAGACAAG TTATTTTTAT TACATAATGG AGAAGCTGCT CAGTCTGTTT 

253151 ATTCTTCTGG GGAATCTTGG AGCTTTGTAG ATCACAAGTG TGGTTTCGAT

253201 AATATTCACT TGGCTTTATA TCGTCAGCAG GGTTCTTCTT TCAATCCTAC 

253251 GAATACGGGA AAAGTTTTTC TTCCTACGAA TCATGAAGGT TTACCTGTAC 

253301 TAGTTGTTGA GTTTCGTAAC AATAAAGAGC CTTTAGTATT TCTAGGTGAG

253351 TACGCACAAG GAAGAATTTC CAATAAAGAT AGCACGATCT TTGGTACAGC

253401 GCTTGTCTTT TGGAGATCAG GAAGCGACTA TATTCCTTTA GGTCTCTATG
253451 ATTCTCGAGA AGAAAAGTTA GTTTCTTTGG ATCTTCCTAT TACACGAGCT
253501 GTAATTTTTG GTAATGACCA AGATTCGGCA AAGTCGTCAG ATACTGCGAA
253551 CCACTATGTT TTATTTAATG ATTACATGCA GATTATTGTT TCTGAAGAGA
253601 GTGGTTCTAT AGAAGGTATC AATTTACCTT TTGCTTCAAC AAATAATAAA
253651 AGCATTGTGA ATGAAATTGG TTTTGATAGG GATTTAGCTT CAGAGAAATC
253701 TCCTGAAGCT CTTTTCCCTG GGCTGTCTTC AAAACTTCCT GATGGCCAAC
253751 AAGCCAAAAA CTCGATTGGA GGTTACTACC CTTTATTGCG CAGGGGATTA

253801 TTAAGTGATT CTAAGAAATT ACTTCCTCTA GAGTATCACG CATTAAATGT 

253851 GGTTTCAGGA AGAGAGCTAG CGACTCCTGT GGCTTTAAGA TACCGAGTTC 

253901 TTTCCTATAC CCCCCATTCC ATTCAATTGG AAAGCTTAGA TAGATCGGTT

253951 CAGAAGGTAT ACAAACTTCC AGAGAATCCG GAAGAAAAGC CCTATGTTTT

254001 TGAAACTGCA ATTACTTTAA CGAAAGAAAC CGAAGATGTA TGGGTAACTT

254051 CAGGAGTTCC TGAAGTGGAG ATCATGTCAA ATGCTTCAGC CCCAACCATT 

254101 AAATACAGGG TTATCAAAAA AAATAAGGGG TCTTTAGATA AAGTTAAGCT 

254151 TCCAAAAGTA AAAGAGCCTT TAGCTATACG TCGTGGTGTT TATCCTCAAT 

254201 GGATTTTAAA TTCGAATGGA TATTTCGGTA TTATTTTAAC TCCGTTGTCT 

254251 GAAATTGCTT CTGGCTATGG ATCTCTCTAC ATTTCGGGTT CTACGGCTCC 

254301 GACAAGATTG TCTGCTATTT CTCCTAAAAA TCAACTGTAT CCAGTATCAA

254351 AATATCCTGG ATATGAGACC TTGCTTCCTT TGCCAAAAGA TGCAGGGACA

254401 CATCGATTTT TAGTGTATGC AGGTCCCTTG GCAGAGCCTA CACTTAAAGT

254451 ATTAGATAAG ACAATTACTC AGGAGAAGGG AGAAAATCCT GAGTATCTTG

254501 ATAGCATTTC TTTCCGTGGT GTTTTTGCAT TTATTACAGC TCCTTTTGCA

254551 GCACTCCTAT TTATTATTAT GAAGTTCTTC AAATTGGTTA CGGGTTCTTG

254601 GGGAATTTCC ATTATTTTAC TTACTGTATT TTTGAAATTG CTTCTCTATC

254651 CTTTAAATGC ATGGTCCATA CGATCTATGA GGCGTATGCA GATTTTATCT

254701 CCTTATATTC AGCAAATTCA GCAAAAGTAT AAGAACGAAC CTAAGCGTGC

254751 TCAGATGGAA ATCATGGGCT TGTATAAGAC AAACAAAGTG AATCCTATCA

254801 CGGGTTGTTT ACCTTTATTG ATACAGCTTC CTTTCCTAAT TGCGATGTTT

254851 GATTTATTAA AGTCATCATT CTTATTACGA GGAGCCTCGT TTATTCCTGG

254901 GTGGATTGAT AACTTAACAG CTCCTGATGT GTTGTTTTCT TGGCAGACAT

254951 CGATATGGTT TATTGGAAAT GAGTTCCACT TACTTCCTAT TCTATTAGGT

255001 ATAGTGATGT TCTTACAACA GAAGGTCACG AGTTTGCATA AGAAAGGACC 

255051 TGTTACGGAT CAGCAGAAAC AGCAACAAGT TATGGGGAAC ATGATGGCGA 
255101 TTTTATTTAC CGCTATGTTC TATAACTTCC CTTCAGGATT AAACATCTAT

255151 TGGCTTTCGT CTATGATTTT AGGAGTCGTC CAGCAGTGGA TCACTAATAA 

255201 GATATTAGAT AGCAAACATC TTAAAAATGA AGTGGTTTTA AATAATAAAA 

255251 AACATCGATA ATACTGAAAA AAAAGTCGAA TTGTCTTAAT ACAGGCTTAT 

255301 TGAAACAAGC GATTTATATG AGCTCATAGT TTATGCGAGC ATGGGAAGAA

255351 TTTCTTTTGC TACAAGAGAA AGAAATTGGC ACAAATACTG TAGACAAGTG 

255401 GTTGCGATCT TTAAAGGTCT TATGTTTTGA TGCTTGTAAT TTGTATCTTG 

255451 AAGCTCAAGA TTCTTTTCAA ATTACTTGGT TTGAGGAGCA TATAAGACAT

255501 AAGGTTAAAT CTGGTCTTGT AAATAATAAC AATAAGCCCA TTCGTGTTCA 

255551 CGTTACTTCG GTAGATAAAG CAGCTCCTTT TTATAAGGAG AAGCAGATGC 

255601 AGCAAGAGAA GACAGCATAC TTTACCATGC ATTATGGAAG TGTGAATCCT

255651 GAGATGACCT TCTCTAATTT TTTAGTTACC CCTGAAAATG ATCTTCCTTT

255701 TCGTGTTTTA CAGGAATTTA CTAAGAGTCC TGATGAAAAC GGAGGAGTTA 

255751 CTTTTAATCC AATTTATCTG TTTGGACCTG AGGGATCTGG AAAAACTCAC

255801 TTAATGCAGT CAGCTATCAG TGTTCTTCGT GAATCTGGAG GTAAGATTCT 

255851 CTATGTTTCT TCGGATTTGT TTACAGAGCA CTTAGTCTCT GCTATCCGTT 

255901 CAGGAGAAAT GCAAAAATTC CGTTCTTTTT ACCGCAATAT TGATGCTCTA 

255951 TTCATTGAGG ATATCGAGGT TTTTTCAGGA AAGTCGGCAA CTCAAGAAGA 

256001 GTTCTTCCAT ACGTTTAATT CTCTTCATTC TGAAGGGAAG TTGATTGTAG 

256051 TGTCTTCATC CTATGCGCCT GTGGATCTCG TTGCTGTTGA AGATAGATTG 

256101 ATCAGCAGGT TTGAATGGGG AGTTGCAATT CCGATACATC CTTTGGTTCA 

256151 GGAAGGATTG CGCAGTTTCT TAATGAGACA GGTAGAGCGC TTATCTATTC 

256201 GCATTCAAGA AACGGCCTTA GATTTTTTAA TTTATGCGCT ATCTTCCAAC 

256251 GTAAAGACCT TACTGCATGC ACTGAATCTT TTAGCAAAGA GGGTAATGTA 

256301 TAAAAAACTC TCTCACCAAT TACTATATGA AGATGATGTG AAAACTCTTT 

256351 TAAAAGATGT TTTAGAAGCA GCAGGAAGCG TTCGTTTAAC TCCTTTAAAG 

256401 ATTATTCGTA ATGTTGCTCA ATATTATGGG GTCTCTCAGG AGAGTATTTT

256451 AGGACGTTCT CAGTCCCGAG AATATGTATT GCCACGTCAG GTAGCCATGT

256501 ACTTTTGTCG TCAGAAGCTT TCACTATCAT ACGTGAGAAT AGGCGATGTC 

256551 TTTTCAAGAG ATCATTCGAC GGTAATCTCA TCCATACGAT TGATTGAACA 

256601 AAAAATAGAA GAAAATAGCC ATGATATTCA CATGGCTATT CAAGATATTT 

256651 CTAAGAATTT AAATTCCTTG CATAAGAGTT TGGAATTTTT CCCTAGCGAA 

256701 GAGATGATTA TTTAGAGGAG TAACGATCCT CTAATTTTTG CTTCTGGACA 

256751 GTTGCAGCAG CCGGCGCAGA TATCATTTTC ATTGAGTTCA ACTCCGCTGT

256801 TCTAGCTTGT TTAAATTGTC TAATATGATT GAAAGCAAGG ACAAGAGCAA 

256851 TACCAGCAAG TACTATCTGT ATAGCAAGAA CAATGCCGAT TCCTAAGGAT 

256901 AGGGTGACTC CACTGGCAAT AACCAGTCCT AGGGTGGTTA AGGAAGCAAG

256951 GAGGAGACCT AGACTAACTA AAATAGGCTT AAAAAATACC CAGGAACAAT

257001 AACGATGTGT TGCCTTGTGA GATACTGATG GTTGTGGTTG TGTAGTCTGA

257051 GGTGTTTGTG CTACTGTAGC CATGTTGTTC TCCTGAGTAA AAAATTAAAT

257101 AATTTTAATC TGTCTATATA ATTGTTGTCA ATAATATTTT TATAGATATT 

257151 ATTTTAATTT TATTAAAATT TATCAACTAA TTTTATTTAT AGTCTTTCTC 

257201 TAGTTCAGAA CTATTAAGCT CTCTTAATAG GCTCTCTTTT ATTTACCAAG 

257251 GCGAAGTTCT rpTTTTTTTTT TCAAATTTCG CATATGGAGG GTTCTAGATG

257301 GTATTTGAAA TGGTTGCATT GTGGAAGATT TTTCGAGTTT TGATAAGAAC
257351 AAAGTCAGTG TTGACTCTAT GAAACGGGCG ATTTTAGATC GTCTGTATTT 
257401 AAGTGTTGTA CAATCACCAG AGTCCGCATC TCCTAGAGAT ATCTTCACAG 

257451 CTGTTGCAAA AACTGTTATG GAATGGTTGG CCAAGGGGTG GCTGAAAACT
257501 CAAAATGGCT ACTATAAAAA TGATGTAAAA AGAGTTTATT ACCTTTCCAT
257551 GGAATTTCTC TTAGGGAGAA GTCTAAAAAG CAATCTTTTG AATTTAGGAA
257601 TTCTAGATTT AGTAAGGAAG GCACTAAAAA CTTTAAATTA TGACTTTGAC
257651 CACCTTGTAG AAATGGAATC CGATGCAGGA TTAGGAAATG GTGGTTTGGG
257701 GAGACTGGCA GCTTGTTACT TGGATTCTAT GGCTACATTA GCAGTTCCAG
257751 CCTACGGCTA CGGTATACGC TATGATTATG GTATTTTTGA TCAGAGGATC
257801 GTCAACGGGT ATCAAGAGGA AGCTCCTGAC GAGTGGCTAC GTTATGGAAA
257851 TCCTTGGGAA ATCTGTAGGG GAGAGTACCT CTATCCCGTA CGATTTTATG
257901 GAAGGGTCAT TCATTATACC GATTCTCGAG GGAAACAGGT GGCAGATCTT
257951 GTCGATACCC AAGAGGTATT GGCGATGGCT TATGATATTC CGATTCCTGG
258001 GTACGGTAAT GATACTGTAA ATTCTCTAAG GCTATGGCAA GCACAATCTC
258051 CGCGAGGCTT TGAATTCAGC TATTTTAACC ACGGGAACTA TATCCAGGCT
258101 ATAGAAGATA TCGCCTTGAT AGAAAACATC TCTCGCGTCC TCTATCCTAA
258151 TGATTCTATT ACTGAGGGGC AGGAATTGCG TCTCAAACAA GAGTATTTTT
258201 TAGTTTCAGC AACCATTCAA GATATTATCC GCAGATATAC AAAGACACAT
258251 ATTTGTTTGG ATAACCTTGC GGATAAAGTC GTAGTACAAT TAAACGATAC
258301 CCATCCCGCT TTAGGGATTG CTGAAATGAT GCATATTTTA GTCGATAGGG
258351 AAGAATTACC TTGGGATAAG GCTTGGGAGA TGACTACAGT CATCTTTAAC
258401 TATACCAATC ATACAATCCT CCCAGAGGCT TTAGAGAGAT GGCCTCTCGA
258451 TTTATTCTCT AAGTTATTAC CTCGGCATTT AGAGATTATT TATGAAATAA
258501 ATTCCCGTTG GTTAGAAAAA GTTGGCTCTC GCTATCCTAA AAATGATGAT
258551 AAGCGCCGGT CTTTATCCAT TGTTGAAGAA GGGTATCAAA AGCGTATCAA
258601 TATGGCAAAC CTTGCCGTAG TAGGTTCTGC AAAAGTAAAT GGAGTTTCGT
258651 CATTCCACTC TCAGCTGATT AAAGATACTC TCTTTAAAGA GTTTTATGAG
258701 TTTTTCCCTG AGAAGTTTAT CAATGTGACC AATGGGGTGA CTCCACGACG
258751 ATGGATTGCT CTCTGTAATC CTCGTTTGAG TAAGCTTCTC AATGAAACTA
258801 TAGGGGATCG TTATATCATT GATCTTTCTC ATCTTTCATT GATCCGTTCC
258851 TTTGCCGAAG ATAGTGGTTT CCGAGATCAT TGGAAAGGGG TAAAATTAAA
258901 AAATAAGCAG GATCTAACAA GTAGAATTTA TAATGAAGTT GGAGAAATAG
258951 TAGACCCTAA TTCTCTCTTT GACTGTCATA TTAAGCGTAT TCATGAGTAT
259001 AAACGACAAC TAATGAATAT TTCTTAGAGT CATCTATGTT TATAATGACT
259051 TGAAAGAAAA CCCTAATCAA GATGTCGTCC CTACAACAGT AATTTTTTCT
259101 GGTAAGGCGG CTCCTGGCTA TGTCATGGCC AAACTCATTA TCAAGTTAAT
259151 CAATAGCGTT GCTGACGTTG TAAATCAAGA TTCTCGAGTT AATGATAAGC
259201 TTAAGGTTCT TTTTTTACCT AACTATCGAG TTTCTATGGC TGAGCATATC
259251 ATTCCTGGTA CAGATCTTTC AGAACAGATT TCTACAGCTG GAATGGAGGC
259301 TTCTGGAACA GGAAATATGA AATTTGCTTT GAATGGAGCT CTGACTATAG
259351 GAACTATGGA CGGTGCAAAT ATAGAAATGG CAGAGCATAT TGGTAAGGAG
259401 AATATGTTTA TTTTTGGTCT TTTGGAGGAG CAAATTGTAC AACTGCGGAG
259451 GGAATACTGT CCTCAGACAA TTTGTGATAA GAATCCTAAG ATCCGTCAGG
259501 TTTTAGATTT GCTAGAACAG GGATTTTTCA ATAGCAATGA TAAAGATCTG
259551 TTTAAACCGA TAGTACATCG CCTACTGCAT GAAGGAGATC CCTTTTTTGT
259601 CTTGGCTGAC TTGGAGTCTT ATATCGCTGC CCATGAAAAT GTGAACAAAC
259651 TCTTTAAGGA ACCAGATTCA TGGACTAAGA TTTCTATTTA TAATACTGCA
259701 GGAATGGGCT TTTTCTCTAG TGACAGAGCC ATTCAGGATT ATGCCAGAGA
259751 TATTTGGCAT GTTCCTACAA AATCTTGCTC TGGAGAAGGA AATTAAGAAA
259801 TAGACAGGAT GGAATCAGGG ACTCTTTCCA TAGCCAAGGA GCTATAGAAA
259851 GAGTCCTTTT TGTTCAAAGA TTGCTAGTTT AATAGTAGGA CAGCCGGAGC
259901 TTCTAAGATC TTTTGTAATC GTTTCATAAA CATCGCAGCA GGATAACCAT
259951 CAATCACTCT ATGATCTACA GATAGGGTAA GATTGCAGGT AGATCCTATA
260001 GTAATTTCTC CGTCAAGAAC AAGAGCTTGT TCTGTAACAC TTCCTACGGC
260051 AAGAATCGCC GCTTGAGGAG GATTGACAAT CGCTGTAAAT TCAGTGATTC
260101 CTGTCATTCC TAAGTTAGAG ACACAGAAGG ACCCTCCTTT GTATTCAGTG
260151 TCTTGAAGAG ATTGATTTCT TGCTTTTAAC GCTAAGCTCT TAATTTCTGC
260201 TGAAATCATG CCGAGATTTT TACGGTCTGC GCAGCGTATA ATTGGCGTAA
260251 TAATTCCATC TGGAATGGCC ACAGCTATCG AGATATCGAT AGTATCAAAA
260301 CGGACGATTT TATTATCGAC ACTGTTAAAT CCTGAATTGA TAGAAGGGAA
260351 CTCTTTGAGC GCCAGAGCAC AGGCACGTAC AATGCAATCG TTAATAGAGA
260401 GTTTGATTCC CTGAGCTTGA AGTTCTTTGA GCAGATTAAG GAGAGGTGAG
260451 GCGTAGACCT GCTGCCTTAC ATAGAAGTGA GGAATAGAGA TCTTAGCAGC
260501 TTGTAGGCGT GCAGCAATCA CTTCCCGAAT CGGAGAGAGA TTCTCCTCAT

260551 GATAGGAACC TGGAGGCACT TCGGGAGACT CAGGATAGCC AAAACCAGCA

260601 ATGCTTTTAG GAGGAGCTTT CTCTAAATCT TTTTTTACTA TACGTCCTCC 

260651 AGGACCACTC CCTTGAATTG ATGAGACATC TATGTTTTTC TCTTTTGCTA 

260701 GTTGTCTAGC TAATGGAGAG AGATTATTCG TAGTGCCTAC GTGTTTGAAG

260751 ACTAAAGGCG AGGAGAGAGG TGGCTCTGGC TTAAAAGTTA CTGCTGTGAA 

260801 TGTTGCTGAG GCAGCTTGTG GAGTTGTTGC AGGCGAGACC TCTTCAGAAG 

260851 AACCTTTTGG AGATGCTTCA AGGTTAGAAG GTTCTGTCTT AGGAAGAAGT

260901 TCTTCTAGAT TAAAGGGCTC GTTGGCTTCT GTAGAGAGTA CCGCAATAGG 

260951 GGTGCCTATA ACGATTTTCT CGCCTTCATG ACGTAAGATT TCACGAATCC 

261001 AGCCATCTTC ATTTGCTGTA TGTTCTAAAA TAGCTTTGTC TGTAGAGATC 

261051 TCTACAATGA CGTCTCCAAA ACTGACCTGA TCATTACTTT TTTTATGCCA 

261101 TTTCACTATA GTGCCCACTT CCATAGTTGG AGAAAGCTTT GGCATTTTCA 

261151 ATAAGGAGAT CACAAACTTA CCTCATGACT TTTTCAATGG TATCTAAGAT 

261201 TCGGTTAACA TTAGGCAAAG TGGCCTGTTC TAAGATTTTA CTATAGGGCA 

261251 TAGGCGTTTC TTTTTGGCAT ACCCTTAAGG GGGGAGCATC AAGAGAATCA

261301 AAAACATGCT CAGTAATCAG GGCAATAATT TCAGAAGAAA TCCCAGCGAA 

261351 GTAGTGGCCC TCTTCAATTA CAATACAGCG TGAAGTTTTT CGTACCGATG
261401 ATAAAATTGT TGATATGTCT AAAGGTTTGA TCGTTCTTAG ATCAATAATT

261451 TCTATAGACA AGCCCCAACG TTTTTTGGCT AGAGAACACG CTTCTTTTGT
261501 AATGGAAACC ATACGGCTAT AAGTAATAAT TGTAAGGTCA TTTCCTTCTT 
261551 GAACTCTATG TGCTTTCCCA ATAGGAACGA GATATTCTTC GGTGGGGACT 
261601 TCCCCTTTTA AGTTATATTC TAGCTCGTTT TCTAAAAAAA GAACGGGGTT 
261651 ATTATTTCTG ATTGCTGATT TTAATAAGCC TTTAGCGTCG TAAGGGTTCG 
261701 AAGGGGCTAT AATAATAAGA CCTGGAATAT TAGCATACAA CGACTCAACG 
261751 CAATGAGAAT GCTGGCAAGA TACCTGGGCT GCAGCACCAT TAGGGCCACG 
261801 AAAAACTATA GGAACGGAAA ACTTCCCTCC AGTCATAAAA TGCATCTTAG 
261851 CTGCATGAGA AATGATTTGG TCCAAGGCTA CAAAGGAAAA GTTCCAGCTC
261901 ATAAATTCTA TAATAGGGCG CAGGCCTGAC AATGCGGCTC CTATTCCAAT
261951 TCCAGAGAAG GCTGCTTCAC TAATAGGAGC ATCAATGACT CTCTTAGGGC
262001 CCCATTTATC TAATAAGCCT TTGGTGACTT TATAAGCACC ATTGTAGTCA
262051 CCAACCTCTT CACCAAGAAT ACAGACATTA GGATCGCGAG ACATCTCTTC
262101 GTCAATTGCT TCTCGGAGAG CTTCTCGAAT TTCTAATGTT TTATGTTTAG
262151 GCATAGACTC CTTCCTCTAA TGTGGTGACG GATGGATCTG ATGAGAGTTT
262201 TGCGTTAGAG AACGCTTCTA AAACAGCAGT TTTGCATTCT TGGCGTATAT
262251 TTTGAAATTC CTCTTCAGTC AGAACCTCTA ATCGAATTAG CCAATCTTTA
262301 GCTAGGACAA TAGGATCTTT TTTAAATAAA CACTGCATTT CTTCTTTCGA
262351 TCTATATAAA TTAGGATCTG ATATAGAATG CCCTCGAAAT CGGGAGCAGA
262401 GACACTCAAC TAAAACCGGA GATTCGGTAT CAACCATATA GCGATAAGCC
262451 TCTCTAAATC CTAAAAGAGA GTTAAATAGA TCAAAACCAT TGACTGTGAC
262501 TGCACGGATA TCGTAGGAAC TTCCTTGAGA CTCTGCTATG GGCTGTTTTG
262551 CAACAGCACG ATTTAATGAC GTTCCCATAC TCCAGCCGTT ATTTTCAATA
262601 ATAAGCATTA GAGGGAGTTG GTGAAGAGAA ACAAAGTTCA GAGTTTCATG
262651 GAATACACCT TGAGCTACCG CACCATCTCC GATAAAGCAT AGAGAAACTC
262701 TATTTTTTTG TTCTTGATAT TTGATGGTAA ATGCGGCTCC AGCTGCGAGG
262751 GGAATTTGTC CTCCGACAAT ACCAAATCCT CCAGGGAAAT TAGGCCCACA
262801 CATATGCATG GATCCTCCAC GACCTAAAGC GCATCCAGTT TCTTTCCCTA
262851 AAAGTTCAGC AGCAATTTCT TGAAGGGGAA TGTTGAGAAG AATCGCAAGT
262901 GCGTGGCAGO GGTATGAAGA GAACACCCAG GGATCTAGTC CTGTGTTTGC
262951 GATTGCAGCA GTTGCTACAG CTTCTTGGCC AGCGTAAGAG TGGTAAAATC
263001 CACCCACTAG CCCTTCTAGA TAGGCTTCTT CTCCTCGGGC TTCGAATTCA
263051 CGAATCAGAA CCATCTGTTT TAAAAATTTA ATACAGGAAG CGGGCCCGTA
263101 AAGGTCTAAG ATCCTTTCTA CTGTGGATTT CTCTGTGCCC TGAGAAGCTA
263151 TATTATAAGG TGCTGAACTA TCCATAACTT TTTTATAAGG GAGACGCTTC
263201 GGGAGCGGTT TTTGATCTCA TTATAGGAGC CTAGGTTTCA AAAGACAAGG
263251 GCTATCCGAA GAAAAAATAA TTAAGTTTTT TTAAAAAATC GAAAAAATTT
263301 TAATATGTTT CTCGTGGCAA TTTAAAGAGA ACAAAAAGGA TTACATCACA
263351 TGGACGTTTC TCGTAAAATT AATCGACACA CTCAGTTTTA CGTGGATTCG
263401 ATAGACGGTG TCATCAAAAA CTTTGATCAT AAGCCTAGTG AAGATAAATC
263451 TCGAGATCAT GAAGAATTAG AGGAAAAGCT TTTAACTATA ACAAAACGTA
263501 TTGTTGCTTC TGCTCAAGAA TTTCAGAATC GTAAGACGGA CTCTAAAAAC
263551 TACTACTTAA AAAAAACTCA ATGGTTGCCA TTCAAAAATG AAGAGTTGGA
263601 GCAAACTAAG GAATTGTTTG CCATGTTAAC TTCGATGGAT AAAAAAATAG
263651 CTCAGTTATT TTTTTATTCT CCCGGATGTA GTAGCGACTG GGTAGAATTT
263701 ACCGAAGTAA TTTGTCATTT AAATGATTCT ATAGGTTTGG GAGGGGTTTT
263751 ACTGTGTTGC GGATTATTCG AACAACAGTG TGAGCATGTT GTTACTGTGA
263801 ATAAAAAGCT AGACCTACCG CTTCTTTTAG GAACTACCGT TGTAAATAGT
263851 CTACGTTATT ATTTAACCTA TAGGAATATC TCTCTTTTGA ATTGTCAGAG
263901 CATGAGTGAG CTAGGCAAGG AGCTCGGGGA TGTTTTAAAG CAACATGGAG
263951 TTGCTTTCAC TTTAATTTTT AAAGAGATTG TGGATATAGA TTTATTAAAC
264001 TATGTGAAAC TAATTCAAGG ACTCAAGCGT AGTGGAAATA TTCAAGCACG
264051 TATTTATGAT AATGATGTGC CTACTTTACC TTCAGTTTCA TCGAGTCCTA
264101 TTGCGTTGAG ATATAGCTTA GCGAATACAA TTCGGGGCCT TGCTTTACAT
264151 GTAGATTTTT CTTCATTGAA GTTTATAAGC CCGTCTATAC TTTCCAATAC
264201 AGAGCACACT GCAAAAGCTT TAAACTCTGG CGGAGAATGT TTTATCTTTT
264251 CTAATCTGGA TGAATTCAAT CTTGGAATGA AAATAGTCAT GCAACTATTG
264301 CGGACAGGGA AGATTAGCCC AGAAATTTTA AATAAAAATA TCATGAAAAT
264351 TCTCATGATA AAAAGAAGAG TACGTTCTTT ATATATTTAA TTATTTGGTC
264401 TTTTTTTTGA TTTCAAAACT AATAAAATAA ATAGAATTTT AGATCTCTGA
264451 AGGAATCTCA GCAAGGCTGG GAGTCGATAG ATCTCTTACT TGTTTTTCTA
264501 ACTTACTTAG TCTTTCTTCA GTTTTAGGAA GGTTCCGAAT TTTAGCAATC




264551 AACCGATGTG TTTCTTGATA AGGTCGTGCT GGAGCGCCTC CATAAATGCC

264601 TGGAGAGGTG ATAGATTTTG TGACTCCAGT TTGAGCAATC ATGATCACAT

264651 GGTCTGCAAT AGAAATATGC CCAGTAATTC CGGTTTGCCC TCCAATGATG

264701 ACATGTTCAC CAATTTTTGT AGAACCTGCA ATGCCTGCTT GGGCAACAAT

264751 AATACTATGC TTTCCAATTT CTACGTGATG AGCTACTTGT ACTTGGTTAT

264801 CTATTTTAGT TCCTTCATGG ATCACGGTGT TCTTGAATCG ACCACGATCT 

264851 ATCGTAGTGT TGGCTCCGAT TTCTACATCA TCACCTACAA TCACATAGCC

264901 TAGATGCTTT AAAGGTTTGT GATGACCAAA AGCATTTGTA ATATAACCAA 

264951 AACCACAGGA TCCTAAAACA GCTCCAGGTT GAACAACTAC ACGGTTTCCC 

265001 ATGAGGACTC TTTCTCGAAT CACCACCTTA GGGTGAATCA GACAGTTAGC 

265051 ACCTAGAACG CTGTGAGCTC CAATGACACT TCCAGCTCCG ATGTATGTGT

265101 CAGAGCCGAT ATGGGCATGT TGACTAATGA CAACGTAAGG TTCTATGGTT 

265151 ACATTTTTCT CAATACGTGC AGTAGGATGA ATCACTGCAG TAGGATGAAT

265201 ACCAGGAAAC CCTGATGTTA CGGGTTCAAT AAACAACTCT ATGCACTTTT 

265251 GAAATGTTAG AGAAGGGGAT TCATTGGTAA TAAGAAAGTT TTTCTTTAGG

265301 TGGGCATGTT GCATTGCCTG AGATCTAGAT AAAATAATAG CACCAGCTTT

265351 GGTGTTTTTT AGAAAGCTAG AGTATTTCTC ATTATCTAAA AAAGCAATAT

265401 GGTGAGGTTG CGCCTGACTA ATATCTTCAA CACCTGAAAT AGGAGTTTCT

265451 ATATTTCCTT GAACTTCGAC TTGTAGTAGC TCAGCTAACT GTTTAAGAGT 

265501 GTAGACTGGT GCTTCGGACA TAGAAAACTC CTTAAACTTG GACTAGTTTT 

265551 GTTTTTTGAA AGATTCGTTA AGAATAGCAA TAATTTCGGT TGTTTTATCA

265601 GTCCCAGGTG CTATTGCTAA GACAGGTTCT TCATTAAGGA TAGCTTCTAG 

265651 TTTTTCTTTG GACCGCACTG ATTCTGCAGC TATTTTTACT TCTTGAATGA

265701 GTTTTTGAAT GCGTTTTACA TGTACTTTGA TTGATAGATT GATAGTACTG 

265751 AGACTGGTAC GCATTGTACT CTCCTGAAAG ATCTTCGAAT TTCTTTCGCA 

265801 ACTCTTCAGA GGCAGAATCC GATAGGCTTT CCATGTAATC TTCATCTTGC 

265851 AACTTATTAT AAATAGAAGT GAGTTCTTCT TCTATTTTCT CAGCATTTTT 
265901 TACAAACTGC TGTTTCATAG CTTCCAATTC TTCAGTTTCC TTTTTACCTA
265951 GATCGGATTC TTCAAGACAT CGCTTTAAAT TAACATAGCC TAAATTTGCA
266001 TGAGCTGCGC TTGTTGATCC TAAAACAAGA AGAAATGTAG AAAATAATAA
266051 TTTTTTCATA ATTACCTGTA CTAACTTGGG TATAGAAAAA GGGTACCAAA
266101 AGCCTTTTCT GAAAACAACA AAGATTTCCT TCGATAAGTC CTTAATTTAT
266151 ATCTTAGAAC ATGCCCCCTA AAGCAAAGAA GAATCGCTGA GATACATCAA
266201 TTTTTTCTCC ATTCAAAGTC TCGGTTGGAC GGAAGGGCCA ACCAAATCCT
266251 AACATAACAG GAACATTATT CATTACATCG AAGCGCAGAC CAAATCCAGC
266301 ACTACTACGT AGATCTTTTA ACGAAATCTT ATACTCTTGT AAACCGACAA
266351 AACCTGAGTC TAAGAATACA AAGGCACTAA TATTAGGTTG TCTGATGAGA
266401 GGGTATTGAA ACTCTTCTGA AATAAGGAGC GAAGAGAGTC CTCCCTGAGG
266451 TTCTGTAGCA GAGTATTTTG GACCGATAAT AAAGGATTTA TATCCCCGAA
266501 CTGTAGTCTC TCCACCTAGG AAGAAGCGCT CACTGACAGG AACTCCTTCA
266551 GCTGTAGTAT TGCTATAGGG TTTAATAAAT TGAGCTTCCC CTTTGATTTT
266601 CAAAATACCT TTACGCGTAA GTTTTCTATA GATAGAGCTG TTTAAAGAGA
266651 GTTTTGTAAA ATGATAAGTT CCTCCCAT^C CAGAAACCTC AAAAGTCACC
266701 CCCCCGCGAA TCCCTGTAGT TGGAGTTCTA GGACTATCTA CAGAATCGTA
266751 ATTCAAGTTG ACACCTGCAG CAGAGACAAA TCCTTTATTG CTGTCTATAT
266801 TTGGCCCTAG GAGGAACTTA CGTTTTTCAT GTAAACTCGT TTGACTTCCT
266851 CGATAAAATA GACCGTATTT CAGGTGTTCG TTCAAGATAT ACGTTGTGCT
266901 GACGTTCCCG CCATAGGTTT GGACAGCATA ATCTTTAGAT AATGCTCTGT
266951 TAATTGATTT ATCTAATTCA ATTCCTAAAA TCCAAGGAGT GTTTAGAAAA
267001 TGAGGTTTGG TCCACTTCAA AGTATAGTCT GTGACTTTGT CCCCGAAGTT
267051 GGCTTTTAAG AATAGATGTT CTCCACCGCC TCTTAGACAA CGAAAACCTT
267101 TAGAAAATAT ATTTCTAGCT CCAAATAGAT CAAAATTACT TTCAGATAGT
267151 TCAATTCCTC CAAAAAGATT GTCAAGAGAA CTAAATCCTA AGAATAAGCC
267201 TAAGTTTCCT GTTGTTGTTT CTTTGACTTC TACAAAAATA TCTCGGTATT
267251 GATCCGCATT GCCCATAGGA TCAAGTTGAG AACGAACTGT ATAGACACTA

267301 ACGCTTTGGA AGTAGCCTGT ATTTCTTAAA CGTTGCTCAG TATCTTCTAG

267351 CTTTAAGCGA TTGAATGTAT CTCCTGGGAA GAGACTGGTT TCGTGTAAAA

267401 TAACGTCAGA TTTTGTATGG GTATTCCCAG TAATTTTAAT TAACCCAACT
267451 TTATAAGGAG ACCCTTCACT TACCTCATAA GTTACATCAT AAATAGGGCG
267501 GGTTGCGTGA GGGATGAAGA GAACGTCTAC ATTGGTATTG ATGTAGCCAT
267551 ACTTTGCATA AGTTTGTTTG ATCTTATGAG CCCCATCCCA TATTTTATCG
267601 GGGCAATAAA GATCATTGGG GCCGACTTGG GATTGCTTTT CTATAAGGCG
267651 TTTTGGCAAA ACCTCAAACC CTTGGATATG GACGTGTCCT AAGGTATATC
267701 GCGACCCTCG ATCAATATCC ATGTAAAGAA GAATATTCCC TTTGTCGTCA
267751 AGGTCATAGT GAGAGTTGAC TATAGCATCA GCGTACCCGT TATTATGTAG
267801 GTAATTCGTA ATTGCCAAGC TATCTTGTTC AACAATATCT GGGTGATAGA
267851 GTCCAGCTCC AGTAAACCAA CTTGTAGTTG TAGAGTGCTG CTTGGTTTGA
267901 ATAAATTCTT GGATATCTGA TTTTTCTGAT CGAGAGATTC CTGAGAACGT 
267951 AAGCTGTTTA ATTTTCCCGC AAGGACCTTC ATTGATTTTA ATTAAAACAT
268001 CGATGTGACC TTTTTCTTGA TTGTGTTCCA GACTGTAGTC TACACTGGAT
268051 GCGAAATATC CTCGCTTGAG ATAATACGTT CTTAGATCAT CAAGACCCTT
268101 AAGAAATTTT TCTCGTTCAA AGAGATCATT ACGGTAAATT TGTAGGGTTT 
268151 TAAGAATTTT ATGTTCAGGA ACGACTTGAT TTCCTGAGAT ATGAATATTT
268201 CGAATTGAGG GTTTAGCTAT TAGGTGAAGG GCTATGTTAG TTTTCCCTTC
268251 AGAAAATTCT ACTTTAGGCT CAACAGAGTC GTATTCTTTA GCTAGAATTC 
268301 TCAAGTCTTC ATCAAAATCT AATTGAGAAA AAAGAGCCCC ACTTCTGGTC 
268351 TTTAATTTGG GTAAGGGATG TTTATTTGAA GCATTTTCTC CTTCCGTTAT 
268401 GATTGTGATA GAGTCTACCA CCACATGGCC TTCTTTAACT TTTTCAGTAG 
268451 AAAATAAAGT TAAAGQGGTT TGGATTAACG CTAGAATAGA TATTTGCAAG 
268501 ATAACTTTAT TTCGCATGAT GAGCATTCCC AAGAGTCTTC CCTAGATAGA 
268551 AAGCTTGTTT ATCAGATAAT GAGGTAAGCA CAAAAATAGG GACAAAGAAG 
268601 TTTTTTAACA AGACCTTACT ACTTAGATTA TTTTTGTATT TCTAAAGAAA

268651 TTTAAATTGT GTTTGATTTT TTCCCGTCCT AACACCGGGA TTATTACTGT 

268701 GCTCAGACTA GAGCCTGAAT ATTTCGGCGG GTTTTTTGCT TCCTTAAAAA 

268751 GCTTAATAAA CCTTGTCGCT CAAATAACTC TACTGATTAC ATGGTTTTAA 

268801 GGCAAATTCG ATAATTCACT GTGTCCTAGG GGAAAGTAAG AAAGACTAGC 

268851 ATAGAAGAGA ACCTTGTTTA CTTTTAGGAT AAGAGAGCTG CTAATAGGAG 

268901 TGTCGTCCAG AAAAAGCTCT TGCCAGTGTC CCTGAATCTA CATAATCAAA 

268951 AGATAAGCCT ATAGGAAGAC CTAAAGCTAG ACGGGAAATA TTTACAGAGA 

269001 AATGTTGTAA TTCTTGTTTT AGAAAAAGGG CAGTAGCATC TCCTTCTAAG 

269051 GTTGCATCAA TGGCTAGGAT AATTTCTTTT GGGCATAGCG TTTCTATGCG 

269101 TGATTTTAAA ATGGAGAGAC GCTCGTTTTC TATATGTTTC CCTGTAATGG 

269151 GCGATAAGAG TGAACCAAGA ACATGATAAC GTCCCTTGAA TACTTTAGAA 

269201 CGTTCTAGAA AGAAAACATC TTTTGGAGAA GCGACAATAC ATAGACTTTG

269251 GTTATCTCTT TCTTCTCTAC AAAAGTGACA GTCTGCCTCT TTAGATTCTT 

269301 TGAGAGTAAA ACATAGGGGA CAGTGACTAC GCTCACTAGC AACATTATGA 

269351 AAAGCGTTAC CTAATATTTT TAATTGTTCG CTGTCCCAAG AGATGAGTTC

269401 AAAAGCAAGT TTTTCTGCTG TTTTAAATCC AATTCCTGGA AGTTTTCGTA
269451 AAAAGAAAAT TAATTTAGAT AAGTAATCTG GATATCTTGT CATAGTGATG
269501 TGAATTTTAT TTTTACATTC GGGTCTGGGG CCTAAATTAA GGTTAGAGTA 
269551 TAAACTCTCT GAATAGTATA CTAGCTTTTT TCTTTACATG TGGTTCTCTG 
269601 TGAATAAAAA CAAAAAAGCA GCAATTTGGG CAACGGGTTC CTATTTGCCT 
269651 GAGAAAGTTC TTTCAAACGC AGATTTAGAA AAAATGGTAG ATACCTCTGA 
269701 TGAGTGGATC GTGACCAGAA CGGGGATCAA AGAGCGTCGT ATTGCTGGAC 
269751 CTCAGGAGTA CACTTCTCTT ATGGGAGCCA TCGCTGCAGA GAAAGCTATA 
269801 GCAAATGCGG GTTTAAGCAA GGATCAGATT GACTGTATCA TTTTCTCGAC 
269851 AGCAGCACCA GATTATATTT TCCCATCAAG CGGAGCTCTT GCTCAAGCAC 
269901 ATTTAGGCAT TGAGGATGTC CCTACATTTG ATTGCCAGGC GGCTTGTACT 
269951 GGGTATTTGT ATGGTTTGTC TGTAGCTAAG GCTTATGTAG AATCAGGTAC

270001 ATATAACCAT GTATTGTTAA TTGCTGCTGA TAAGTTGTCT TCTTTTGTAG

270051 ATTATACAGA TCGGAATACC TGTGTGTTGT TTGGAGATGG AGGAGCTGCT

270101 TGTGTCATAG GGGAGAGTCG GCCAGGATCT TTAGAGATTA ATAGGTTGTC

270151 TTTAGGCGCA GATGGTAAGC TAGGAGAGTT ATTAAGCCTT CCTGCTGGAG 

270201 GTAGTCGTTG TCCTGCTTCT AAAGAGACTT TACAATCAGG CAAACATTTT 

270251 ATTGCTATGG AGGGAAAAGA AGTTTTTAAG CATGCTGTGA GACGTATGGA

270301 AACGGCAGCT AAACATTCGA TAGCCCTGGC AGGCATTCAG GAAGAGGATA

270351 TAGATTGGTT TGTACCTCAT CAAGCTAATG AAAGAATAAT AGATGCTTTA

270401 GCGAAGCGTT TTGAGATTGA TGAGTCTAGA GTGTTTAAGA GTGTACATAA

270451 GTATGGAAAT ACTGCGGCCT CGTCTGTGGG CATTGCTTTG GATGAATTAG

270501 TTCATACAGA ATCCATTAAG CTTGATGATT ATTTACTTTT AGTTGCCTTT

270551 GGGGGCGGTT TGTCTTGGGG CGCAGTAGTT TTAAAGCAGG TCTAATAAGG

270601 ACGATAATTT CATGAAAAAA CGTTATGCTT TTTTGTTCCC AGGACAAGGG 

270651 AGCCAATATG TAGGTATGGG ACAAGACCTA TATATGGAGT ATCCTGAGGT

270701 TAGAGAGCTT TTTGATTTTG CTAATGAAAG GTTAGGATTT TCTCTGACTT

270751 CAATTATGTT TGAAGGTCCT GAGGATCTTT TGATGGAAAC AGTACATAGT 

270801 CAGCTAGCTA TTTATCTTCA TAGCATGGCT GTGGTAAAGG TTCTATCTCA

270851 GCGTTCTTCT ATTCAGCCTT CTTTAGTCTC TGGATTAAGT TTAGGGGAGT

270901 ATACTGCTTT AGTTGCTTCC GATAGAATCT CCGTGCTCGA CGGCCTTGAG

270951 CTTGTTAGAA AGCGTGGTCA GTTAATGAAT GAAGCTTGTA ATCAGAGCCC

271001 AGGGGCTATG GCGGCTTTAT TAGGGCTTCC CTCTGAAGTT ATAGAGGAAA 

271051 ATATAACAAG TCTTGGTCAA GGAATTTGGA TTGCTAATTA TAATGCACCC 

271101 AAACAGCTTG TAGTGGCTGG AATAGCAGAA AAAGTAGACC AAGCGATTGA 

271151 GTTATTTCGT GATTTAGGAT GTAAAAAAGC AGTTCGTTTA AAGGTGTCTG 

271201 GAGCATTTCA TACTCCTTTA ATGCAAGTTG CTCAAGATGG CTTAGCTCCA 

271251 GACATTTATG CTTTATGCAT GAAAGATTCT AGCCTTCCCT TAGTGTCACA 
271301 CGTGGTAGGA AAATCTTTAG TAAATACTGA AGAAATGCGA GAGTGTTTAG

271351 CTCGGCAAAT GACATCACCT ACGTTATGGT ATCAGAGTTG TTACCATATC

271401 GAATCAGAGG TGGATGAGTT TTTAGAATTA GGTCCAGGAA AAGTTTTGGC
271451 TGGTTTAAAT CGCTCTATAG GGATTTCTAA ACCGATTACA AGTCTTGGTA 
271501 CTTTTGCTCA GATTGAAAAA TTCCTATCAG AGGTATGATT TGTATGGATA 
271551 TAACATTAGT AGGCAAAAAA GTTATAGTAA CTGGAGGATC TCGAGGAATT 
271601 GGACTCGGGA TAGTTAAGCT TTTTCTTGAG AACGGAGCAG ATGTAGAAAT 
271651 TTGGGGATTG AATGAGGAGC GAGGTCAGGC TGTTATAGAA AGTTTAACAG 
271701 GCTTGGGTGG CGAAGTTTCT TTTGCTCGTG TGGATGTGAG TCATAATGGT
271751 GGAGTGAAAG ATTGCGTGCA GAAATTTTTA GATAAGCACA ACAAAATAGA 
271801 TATTTTGGTA AATAATGCAG GCATTACCAG GGATAATTTG TTGATGCGTA 
271851 TGTCTGAGGA CGACTGGCAA TCGGTGATTA GCACCAACTT GACTTCCTTG 
271901 TATTATACAT GTTCCTCAGT GATTCGCCAT ATGATTAAGG CGCGTTCAGG 
271951 ATCTATTATA AATGTGGCTT CTATTGTTGC TAAGATCGGT AGTGCGGGCC 
272001 AGACCAACTA TGCTGCTGCT AAAGCTGGGA TTATTGCTTT CACAAAATCT 
272051 TTAGCTAAGG AAGTAGCTGC AAGAAATATT CGTGTCAACT GCCTTGCTCC 
272101 AGGCTTTATT GAAACAGACA TGACAAGCGT GTTGAATGAC AATTTAAAAG 
272151 CTGAGTGGCT TAAGTCGATC CCTTTAGGTA GGGCTGGCAC TCCAGAAGAT 
272201 GTTGCTCGTG TGGCGTTGTT TTTAGCCTCG CAGTTATCGA GCTATATGAC 
272251 CGCGCAGACA CTGGTTGTTG ATGGGGGATT GACTTACTAA GACAATAGAA 

272301 GAAAGGGATT TGAAAATTTC TCTTCGAGAA CTAATTAAGT AACCGTCGAA
272351 TAAAAAATGA TTTTTTGCGA TACTAATTCT CTTTCTCTTT GTCCCTAGGG 
272401 AATAGTGAAG TATTGTATAG TTTAAATAGT AAAAGGATAT AAGCAATGAG 

272451 TTTAGAAGAT GATGTAATAG CAATTATTGT TGAGCAGTTA GGAGTGGATC
272501 CAAAAGAAGT TAATGAGAAC TCTTCTTTTA TTGAAGACTT GAATGCTGAT 
272551 AGTTTAGATT TAACAGAATT GATTATGACT TTAGAAGAAA AATTTGCTTT 
272601 TGAAATTTCA GAAGAAGATG CTGAGAAGCT TCGTACTGTC GGGGATGTAT 
272651 TTACTTATAT TAAGAAACGT CAAGCTGAAC AATAAACTTT CTTATATTCT
272701 GGGTGCGTCG TAATCTTTTT TAAAGCTATT GTTTTTCAAT AGTGATTACG
272751 GCGTGCCTTT TTTTCTATAG AGTAGCTCAT CTAGAAAGAT TTATTTTGTC
272801 TTTTTAAGGT TCTCTGAACT TGATTTGTTT AGCATAGAGC TCTAAAAAAG
272851 ATAAAGCTAC GGATGGGCAC TCTTCCACAA fGTTTAGAAT TTGTCCTTTG
272901 CTAAGAACTA GCATGCGGAC TTGTGTATTT GCAGAAGCAT TGTATTCCCT
272951 GGGCTTATTA TTGAATAAGC TTTCCTCTCC AAAACAATCT AAAGGTTTTA
273001 AATTTAGAGG AGACTCTAGT TTTTCTTTAG AGATCGTAAT GTATCCTTCT
273051 ACAATGATAT AAAAGCTGAA TCCAGGTTGT CCTATAGAGA ATACATTGCT
273101 GCCAGGCTTA AATATTATCG TTTCAGTTTT ATCGGCAATT GTTAAAAGAA
273151 GGTCCATGTC TAAAGATTGG AATATAATCG TTTTTTTTAG TAGAAAGGCG
273201 CGATCGATCA AATTCATAAA AAAGTTCCTT ATTCACACCA TAGTTTTAGT
273251 TTTT